PTO/SB/05 (08-00) 

Approved for use through 10/31/2002. OMB 0651-0032
U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.



Please type a plus sign (+) inside this box 




UTILITY 
PATENT APPLICATION 
TRANSMITTAL 



Attorney Docket No.: RPMS 101 CON(3)

First Inventor: David William Holden

Title: IDENTIFICATION OF GENES

Express Mail Label No.: EL 381 202 131 US

APPLICATION ELEMENTS

See MPEP chapter 600 concerning utility patent application contents.



ADDRESS TO: 



Assistant Commissioner for Patents 
Box Patent Application 
Washington, DC 20231 



1. [X] Fee Transmittal Form (e.g., PTO/SB/17)
   (Submit an original and a duplicate for fee processing)

2. [ ] Applicant claims small entity status. See 37 CFR 1.27.

3. [X] Specification [Total Pages: 262]
   (preferred arrangement set forth below)
   - Descriptive title of the invention
   - Cross Reference to Related Applications
   - Statement Regarding Fed sponsored R&D
   - Reference to sequence listing, a table, or a computer program listing appendix
   - Background of the Invention
   - Brief Summary of the Invention
   - Brief Description of the Drawings (if filed)
   - Detailed Description
   - Claim(s)
   - Abstract of the Disclosure

4. [X] Drawing(s) (35 U.S.C. 113) [Total Sheets: 112]

5. Oath or Declaration [Total Pages: 2]
   a. Newly executed (original or copy)
   b. [X] Copy from a prior application (37 CFR 1.63(d))
      (for continuation/divisional with Box 17 completed)
      i. [ ] DELETION OF INVENTOR(S)
         Signed statement attached deleting inventor(s) named in the prior application, see 37 CFR 1.63(d)(2) and 1.33(b)

6. Application Data Sheet. See 37 CFR 1.76

7. [ ] CD-ROM or CD-R in duplicate, large table or Computer Program (Appendix)

8. Nucleotide and/or Amino Acid Sequence Submission (if applicable, all necessary)
   a. Computer Readable Form (CRF)
   b. Specification Sequence Listing on:
      i. [ ] CD-ROM or CD-R (2 copies); or
      ii. [X] paper
   c. [X] Statements verifying identity of above copies



ACCOMPANYING APPLICATION PARTS

9. Assignment Papers (cover sheet & document(s))
10. 37 CFR 3.73(b) Statement (when there is an assignee); Power of Attorney
11. English Translation Document (if applicable)
12. [ ] Information Disclosure Statement (IDS)/PTO-1449; Copies of IDS Citations
13. [X] Preliminary Amendment
14. Return Receipt Postcard (MPEP 503) (Should be specifically itemized)
15. [ ] Certified Copy of Priority Document(s) (if foreign priority is claimed)
16. [X] Other: Check for $988.00



17. If a CONTINUING APPLICATION, check appropriate box, and supply the requisite information below and in a preliminary amendment,
or in an Application Data Sheet under 37 CFR 1.76:

[X] Continuation  [ ] Divisional  [ ] Continuation-in-part (CIP) of prior application No. 09/201,945

Prior application information: Examiner: Robert Schwartzman; Group/Art Unit: 1636

For CONTINUATION OR DIVISIONAL APPS only: The entire disclosure of the prior application, from which an oath or declaration is supplied under
Box 5b, is considered a part of the disclosure of the accompanying continuation or divisional application and is hereby incorporated by reference.
The incorporation can only be relied upon when a portion has been inadvertently omitted from the submitted application parts.



CORRESPONDENCE ADDRESS

Customer Number or Bar Code Label: 23579

or [ ] Correspondence address below

Name: Patrea L. Pabst
      Arnall Golden & Gregory, LLP

Address: 2800 One Atlantic Center
         1201 West Peachtree Street

City: Atlanta    State: GA    Zip Code: 30309-3450



Name (Print/Type): Robert A. Hodges

Registration No. (Attorney/Agent): 41,074

Signature:

Date: November 16, 2000



the amount of time you are required to complete this form should be sent to the Chief Information Officer, U.S. Patent and Trademark Office, Washington, DC
20231. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Assistant Commissioner for Patents, Box Patent Application,
Washington, DC 20231.



PTO/SB/17 (08-00) 
Approved for use through 10/31/2002 OMB 0651-0032 
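U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE

Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.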



FEE TRANSMITTAL
for FY 2001

Patent fees are subject to annual revision
Express Mail Label No.: EL 381 202 131 US

TOTAL AMOUNT OF PAYMENT ($) 988.00

Complete if Known

Application Number: Not Yet Assigned
Filing Date: November 16, 2000
First Named Inventor: David William Holden
Examiner Name: Not Yet Assigned
Group Art Unit: Not Yet Assigned
Attorney Docket No.: RPMS101 CON(3)



METHOD OF PAYMENT (check one)

1. [ ] The Commissioner is hereby authorized to charge
   indicated fees and credit any overpayments to:
   Deposit Account Number: 01-2507
   Deposit Account Name: Arnall Golden & Gregory
   [X] Charge Any Additional Fee Required Under 37 CFR 1.16 and 1.17
   [ ] Applicant claims small entity status. See 37 CFR 1.27

2. [X] Payment Enclosed:
   [X] Check  [ ] Credit card  [ ] Other



FEE CALCULATION

1. BASIC FILING FEE

Large Entity        Small Entity
Fee Code  Fee ($)   Fee Code  Fee ($)   Fee Description          Fee Paid
101       710       201       355       Utility filing fee       710.00
106       320       206       160       Design filing fee
107       490       207       245       Plant filing fee
108       710       208       355       Reissue filing fee
114       150       214       75        Provisional filing fee

SUBTOTAL (1) ($) 710.00



2. EXTRA CLAIM FEES

                                   Extra Claims   Fee from below   Fee Paid
Total Claims         31 - 20 =     11           x 18             = 198.00
Independent Claims    4 -  3 =      1           x 80             =  80.00
Multiple Dependent

Large Entity        Small Entity
Fee Code  Fee ($)   Fee Code  Fee ($)   Fee Description
103       18        203       9         Claims in excess of 20
102       80        202       40        Independent claims in excess of 3
104       270       204       135       Multiple dependent claim, if not paid
109       80        209       40        ** Reissue independent claims over original patent
110       18        210       9         ** Reissue claims in excess of 20 and over original patent

SUBTOTAL (2) ($) 278.00



3. ADDITIONAL FEES

Large Entity        Small Entity
Fee Code  Fee ($)   Fee Code  Fee ($)   Fee Description                                          Fee Paid
105       130       205       65        Surcharge - late filing fee or oath
127       50        227       25        Surcharge - late provisional filing fee or cover sheet
139       130       139       130       Non-English specification
147       2,520     147       2,520     For filing a request for ex parte reexamination
112       920*      112       920*      Requesting publication of SIR prior to Examiner action
113       1,840*    113       1,840*    Requesting publication of SIR after Examiner action
115       110       215       55        Extension for reply within first month
116       390       216       195       Extension for reply within second month
117       890       217       445       Extension for reply within third month
118       1,390     218       695       Extension for reply within fourth month
128       1,890     228       945       Extension for reply within fifth month
119       310       219       155       Notice of Appeal
120       310       220       155       Filing a brief in support of an appeal
121       270       221       135       Request for oral hearing
138       1,510     138       1,510     Petition to institute a public use proceeding
140       110       240       55        Petition to revive - unavoidable
141       1,240     241       620       Petition to revive - unintentional
142       1,240     242       620       Utility issue fee (or reissue)
143       440       243       220       Design issue fee
144       600       244       300       Plant issue fee
122       130       122       130       Petitions to the Commissioner
123       50        123       50        Petitions related to provisional applications
126       240       126       240       Submission of Information Disclosure Stmt
581       40        581       40        Recording each patent assignment per property (times number of properties)
146       710       246       355       Filing a submission after final rejection (37 CFR § 1.129(a))
149       710       249       355       For each additional invention to be examined (37 CFR § 1.129(b))
179       710       279       355       Request for Continued Examination (RCE)
169       900       169       900       Request for expedited examination of a design application
Other fee (specify)

* Reduced by Basic Filing Fee Paid

SUBTOTAL (3) ($) 0.00



SUBMITTED BY                                    Complete (if applicable)

Name (Print/Type): Robert A. Hodges

Registration No. (Attorney/Agent): 41,074

Telephone: (404) 873-8796

Signature:

Date: November 16, 2000



WARNING: Information on this form may become public. Credit card information should not
be included on this form. Provide credit card information and authorization on PTO-2038.



Burden Hour Statement: This form is estimated to take 0.2 hours to complete. Time will vary depending upon the needs of the individual case. Any comments on
the amount of time you are required to complete this form should be sent to the Chief Information Officer, U.S. Patent and Trademark Office, Washington, DC
20231. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Assistant Commissioner for Patents, Washington, DC 20231.



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 




Applicant: David William Holden

Filed: November 16, 2000

Date of Deposit: November 16, 2000

For: IDENTIFICATION OF GENES



BOX PATENT APPLICATION 
Assistant Commissioner for Patents 
Washington, D.C. 20231 



REQUEST FOR FILING A 
CONTINUATION APPLICATION UNDER 37 C.F.R. § 1.53(b) 



Sir: 



Pursuant to 35 U.S.C. § 21(a) as amended by Public Law 97-247 and 37 C.F.R. § 1.10,
David William Holden encloses for filing the attached patent application entitled "Identification
of Genes." The application includes 1 page of Abstract, 83 pages of Specification, 8 pages of
claims, 112 sheets of Formal Drawings, 170 pages of Sequence Listing, and a copy of an
executed Declaration for Patent Application. This application is a continuation of pending prior
application Serial No. 09/201,945 filed December 1, 1998, entitled "Identification of Genes", by
David William Holden, which is a continuation of prior Application No. 08/871,355, filed
June 9, 1997, which is a continuation of 08/637,759, filed December 11, 1995, which is a 371 of
PCT/GB95/02875, filed December 11, 1995.

Continuation of U.S.S.N. 09/201,945
Applicant: David William Holden 
Date of Deposit: November 16, 2000 
REQUEST FOR FILING A CONTINUATION 
APPLICATION UNDER 37 C.F.R. § 1.53(b) 
Express Mail Label No.: EL 381 202 131 US 

Submitted with the above-identified application are (1) a check in the amount of $988.00 
to cover the filing fee, (2) a Preliminary Amendment, (3) a copy of the assignment from David 
William Holden to RPMS Technology Limited as filed in prior application Serial No. 
08/637,759, and recorded at Reel 9113, Frame 0723 on July 19, 1997, (4) a copy of the 
assignment from RPMS Technology Limited to Imperial College Innovations Limited as filed in 
prior Application Serial Nos. 08/637,759, 08/871,355, and 09/201,945, and recorded at Reel 
010113, Frame 0746 on July 26, 1999; (5) executed Combined Declaration for Patent 
Application, filed in Application Serial Nos. 08/737,759 and 09/201,945; (6) Statement Under 37 
C.F.R. 3.73(b), (7) Associate Power of Attorney Under 37 C.F.R. § 1.34; and (8) Fee 
Transmittal. 

Please preliminarily amend the application in accordance with the Preliminary 
Amendment. 

It is believed that $988.00 is the proper filing fee since the application will include 4 
independent claims and a total of 31 claims after entry of the Preliminary Amendment. The 
Commissioner is hereby authorized to charge any additional fees, which may be required, or 
credit any overpayment to Account No. 01-2507. To facilitate this process, applicant has 
enclosed a duplicate of this letter. 
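
As a check on the fee stated above, the arithmetic can be sketched as follows. This is only an illustration: the rates are the large-entity amounts listed on the accompanying FY 2001 fee transmittal (fee codes 101, 103, and 102), and the function name `filing_fee` is hypothetical, not part of any official tool.

```python
# Large-entity rates as listed on the FY 2001 fee transmittal in this filing.
BASIC_UTILITY_FEE = 710       # fee code 101, utility filing fee
EXTRA_CLAIM_FEE = 18          # fee code 103, per claim in excess of 20
EXTRA_INDEPENDENT_FEE = 80    # fee code 102, per independent claim in excess of 3


def filing_fee(total_claims: int, independent_claims: int) -> int:
    """Return the filing fee in dollars for a large-entity utility application.

    Hypothetical helper for illustration; multiple dependent claim fees and
    surcharges are not modeled because none were paid in this filing.
    """
    extra_total = max(0, total_claims - 20)
    extra_independent = max(0, independent_claims - 3)
    return (BASIC_UTILITY_FEE
            + extra_total * EXTRA_CLAIM_FEE
            + extra_independent * EXTRA_INDEPENDENT_FEE)


# 31 total claims and 4 independent claims, as stated in this letter.
print(filing_fee(31, 4))  # 710 + 11*18 + 1*80 = 988
```

The result agrees with the subtotals on the fee transmittal: $710.00 basic filing fee plus $278.00 in extra claim fees.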

Pursuant to 37 C.F.R. § 1.63(d), the copy of the executed Declaration included in the 
above-identified application is a copy of the executed Declaration filed in parent Application
Serial No. 09/201,945, to which the present application claims benefit. The power of attorney in
the prior application is to Patrea L. Pabst, Madeline I. Johnston, and Dolly A. Vance. An 
Associate Power of Attorney is enclosed. The inventorship for the claims in the present 
application differs from the inventorship in the parent application, Serial No. 09/201,945, in that 
David William Holden is the sole inventor of the claims in the present Application, following 
entry of the Preliminary Amendment. 

The subject matter of this application is also related to the subject matter of Application 
Serial No. 08/637,759, filed December 11, 1995, Application Serial No. 08/871,355, filed on 
June 9, 1997, and Application Serial No. 09/201,945, filed December 1, 1998, by David William 
Holden. 

This application contains nucleic acid and/or protein sequences as defined in 37 C.F.R. 
§ 1.821-1.825. The sequence listing for the new application is identical to the sequence listing 
for application Serial No. 08/637,759, filed December 11, 1995, by David William Holden.

Sequence Listings in computer readable form were submitted in Application Serial No. 
08/871,355, filed June 9, 1997, entitled "Identification of Genes", by David William Holden on 
October 31, 1997 and January 26, 1999. Accordingly, pursuant to 37 C.F.R. § 1.821(e), applicant
hereby requests that the computer readable form of the sequence listing submitted on January 26, 
1999, in application Serial No. 08/871,355 be used as the computer readable form of the
sequence listing for the above-identified application. The application has a paper copy of the
Sequence Listing incorporated therein. 

I declare that the paper copy of the Sequence Listing in the present application is 
identical to the material in the prior sequence listing, and that the Sequence Listing does not add 
new matter to the application, and that all statements made on information and belief are 
believed to be true and further that these statements were made with the knowledge that willful 
false statements may jeopardize the validity of the application or any patent issuing thereon. 

This application is being filed on November 16, 2000, by mailing the application to Box 
Patent Application, Commissioner for Patents and Trademarks, Washington, D.C. 20231 via the 
United States Postal Service "Express Mail Post Office to Addressee" Service under 37 C.F.R. § 
1.10. The Express Mail Label No. EL 381 202 131 US appears in the heading of this paper, 
which is attached to the application, pursuant to 37 C.F.R. § 1.10(b).



All correspondence concerning this application should be mailed to: 



Patrea L. Pabst, Esq. 

ARNALL GOLDEN & GREGORY, LLP 
2800 One Atlantic Center 
1201 West Peachtree Street 
Atlanta, GA 30309-3450 



Respectfully submitted,



Robert A. Hodges
Reg. No. 41,074

Date: November 16, 2000

ARNALL, GOLDEN & GREGORY, LLP
2800 One Atlantic Center
1201 West Peachtree Street
Atlanta, Georgia 30309-3450
(404) 873-8796
(404) 873-8797 (fax)



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



Applicant: David William Holden

Serial No.: Continuation of 09/201,945

Express Mail Label No.: EL 381 202 131 US

Filed: November 16, 2000

Date of Deposit: November 16, 2000

For: IDENTIFICATION OF GENES

Box Patent Application 

Assistant Commissioner for Patents 

Washington, D.C. 20231 

PRELIMINARY AMENDMENT 

Sir: 

Prior to examination, please amend the application as follows. This Preliminary 
Amendment is being filed along with a Continuation Application filed under 37 C.F.R. § 1.53(b). 
It is believed that no fee is required with this Amendment. However, should a fee be required, 
the Commissioner is hereby authorized to charge any required fees to Deposit Account No. 
01-2507. 

Amendment 
In the Specification 

On page 1, after the title "IDENTIFICATION OF GENES" and before line 3, which 
begins "The present invention" please insert as a new paragraph 

--This application is a continuation of copending Application Serial No. 09/201,945, filed
December 1, 1998, entitled "IDENTIFICATION OF GENES," by David William Holden, which
is a continuation of 08/637,759, filed July 19, 1997, which is a 371 of PCT/GB95/02875, filed
December 11, 1995. Application Serial No. 09/201,945, filed December 1, 1998, and
Application Serial No. 08/637,759, filed July 19, 1997, are hereby incorporated herein by
reference.--

On page 22, line 13, following "5 and 6" please add --(SEQ ID Nos 39-44 and 8-36)--. 

On page 22, line 20, following "11 and 12" please add --(SEQ ID Nos 37 and 38)--.

On page 39, lines 19-20, please replace "12301 Parklawn Drive, Rockville, Maryland
20852" with --10801 University Boulevard, Manassas, Virginia 20110-2209--.

On page 43, line 7, please replace "Figure 1 illustrates" with --Figures 1A and 1B
illustrate--.

On page 43, line 29, please replace "Figure 5 shows" with --Figures 5A through 5D
show--.

On page 44, line 1, following "genome" please add --(SEQ ID Nos 39 and 40)--.

On page 44, line 3, please replace "Figure 6 shows" with --Figures 6A through 6H
show--.

On page 44, line 6, please replace "Figure 7 shows" with --Figure 7A and 7B show--.

On page 44, line 20, please replace "Figure 8 shows" with --Figures 8A, 8B, and 8C
show--.

On page 45, line 19, please replace "Figure 10 shows" with --Figures 10A and 10B
show--.

On page 45, line 27, please replace "Figure 11 shows" with --Figures 11A through 11BW
show--.

On page 45, line 29, please replace "2" with --8--.

On page 45, line 29, please replace "all six" with --three forward--.

On page 45, line 29, following "reading frames" please add --(the amino acid sequences
in the "a" reading frame are SEQ ID Nos. 45-187, the amino acid sequences in the "b" reading
frame are SEQ ID Nos. 188-356, and the amino acid sequences in the "c" reading frame are SEQ
ID Nos. 357-501)--.

On page 46, line 6, please replace "Figure 12 shows" with --Figures 12A through 12P
show--.

On page 46, line 7, please replace "Figure 2" with --Figure 8--.

On page 46, lines 7 and 8, please delete "DNA is translated in all six reading frames and
the".

On page 56, line 28, following "Figure 5" please add --; SEQ ID Nos 39 and 40--.

On page 57, line 2, following "B1 to B5" please add --; SEQ ID Nos 8-36--.

In the Claims 
Please amend the claims as follows. 

3. (Amended) [A] The method according to [Claims 1 or 2] Claim 57 further comprising
[the steps:

(1A)] after step (a), removing auxotrophs from the plurality of [mutants produced in step
(1); or

(6A) determining whether the mutant selected in step (6) is an auxotroph; or
both (1A) and (6A)] mutant microorganisms.

Please add the following new claims. 

57. (New) A method for identifying a mutant microorganism having a reduced 
adaptation to a particular environment comprising the steps of 

(a) providing a plurality of mutant microorganisms wherein each microorganism contains 
a different marker sequence; 

(b) introducing the plurality of microorganisms of step (a) into the said particular 
environment and allowing those microorganisms which are able to do so to grow in the said 
environment; 

(c) retrieving microorganisms from the said environment or a selected part thereof; and 

(d) selecting an individual microorganism having a reduced capacity to proliferate in the 
particular environment by comparing any marker sequences in the nucleic acid present in the 
retrieved microorganisms in step (c) to the different marker sequences referred to in step (a). 

58. (New) The method of Claim 57 for identifying a gene which allows a microorganism 
to adapt to a particular environment further comprising the step: 

(e) identifying the gene which is mutated in the individual microorganism having a 
reduced capacity to proliferate in the particular environment. 


59. (New) The method of Claim 58 for isolating a gene which allows a microorganism to 
adapt to a particular environment further comprising the step: 

(f) isolating from a wild-type microorganism the corresponding wild-type gene. 

60. (New) The method of Claim 59 wherein the particular environment is a differentiated 
multicellular organism. 

61. (New) The method of Claim 60 wherein the multicellular organism is a plant. 

62. (New) The method of Claim 61 wherein the microorganism is a bacterium 
pathogenic to plants. 

63. (New) The method of Claim 61 wherein the microorganism is a fungus pathogenic to
plants.

64. (New) The method of Claim 60 wherein the multicellular organism is a non-human
animal.

65. (New) The method of Claim 64 wherein the animal is selected from the group 
consisting of a mouse, rat, rabbit, dog and monkey. 

66. (New) The method of Claim 65 wherein the animal is a mouse. 

67. (New) The method of Claim 64 wherein the microorganism is a fungus pathogenic to 
animals. 

68. (New) The method of Claim 67 wherein the fungus is selected from the group 
consisting of Aspergillus spp., Cryptococcus neoformans and Histoplasma capsulatum.

69. (New) The method of Claim 64 wherein in step (b) the microorganisms are
introduced orally, intravenously, intranasally, or intraperitoneally. 

70. (New) The method of Claim 69 wherein in step (c) the microorganisms are retrieved 
from the spleen. 

71. (New) The method of Claim 64 wherein the microorganism is a bacterium
pathogenic to animals.

72. (New) The method of Claim 71 wherein the bacterium is selected from the group
consisting of Bordetella pertussis, Campylobacter jejuni, Clostridium botulinum, Escherichia
coli, Haemophilus ducreyi, Haemophilus influenzae, Helicobacter pylori, Klebsiella
pneumoniae, Legionella pneumophila, Listeria spp., Neisseria gonorrhoeae, Neisseria 
meningitidis, Pseudomonas spp., Salmonella spp., Shigella spp., Staphylococcus aureus, 
Streptococcus pyogenes, Streptococcus pneumoniae, Vibrio spp., and Yersinia pestis. 

73. (New) The method of Claim 60 wherein in step (c) the microorganisms are retrieved 
from the said environment at a site remote from the site of introduction in step (b). 

74. (New) A gene obtained by the method of Claim 59. 

75. (New) A mutant microorganism comprising a mutation in a gene identified using the 
method of Claim 58. 

76. (New) The method of Claim 57 wherein the microorganism is a bacterium. 

77. (New) The method of Claim 57 wherein the microorganism is a fungus.

78. (New) The method of Claim 57 wherein in step (d) the comparison of any marker
sequences in the nucleic acid of the mutants retrieved in step (c) to the marker sequences referred 
to in step (a) uses DNA amplification techniques and oligonucleotide primers. 

79. (New) A mutant microorganism obtained by the method of Claim 57. 

80. (New) A non-human animal or plant, or an animal or plant cell culture, containing a 
plurality of mutant microorganisms wherein each mutant contains a different marker sequence. 

81. (New) The non-human animal or plant, or an animal or plant cell culture, of Claim
80 wherein the microorganism is a pathogenic microorganism.

82. (New) A non-human animal or an animal cell culture containing a plurality of mutant 
microorganisms wherein each mutant contains a different marker sequence and wherein the 
microorganism is pathogenic to animals. 

83. (New) The non-human animal or an animal cell culture of Claim 82 wherein the 
microorganism is selected from the group consisting of Bordetella pertussis, Campylobacter 
jejuni, Clostridium botulinum, Escherichia coli, Haemophilus ducreyi, Haemophilus influenzae, 
Helicobacter pylori, Klebsiella pneumoniae, Legionella pneumophila, Listeria spp., Neisseria 
gonorrhoeae, Neisseria meningitidis, Pseudomonas spp., Salmonella spp., Shigella spp., 
Staphylococcus aureus, Streptococcus pyogenes, Streptococcus pneumoniae, Vibrio spp., and 
Yersinia pestis. 

84. (New) The non-human animal of Claim 83 which is a mouse or rat or rabbit or dog 
or monkey.

85. (New) The non-human animal of Claim 82 which is a mouse or rat or rabbit or dog
or monkey. 

86. (New) A method for identifying a microorganism having a reduced adaptation to a 
particular environment comprising the steps of 

(a) providing a plurality of microorganisms wherein each microorganism contains a 
different marker sequence; 

(b) introducing the plurality of microorganisms of step (a) into the said particular 
environment and allowing those microorganisms which are able to do so to grow in the said 
environment; 

(c) retrieving microorganisms from the said environment or a selected part thereof; and 

(d) selecting an individual microorganism having a reduced capacity to proliferate in the 
particular environment by comparing any marker sequences in the nucleic acid present in the 
retrieved microorganisms in step (c) to the different marker sequences referred to in step (a). 
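
The comparison recited in step (d) of claims 57 and 86 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each marker sequence has already been reduced to a unique label and that a marker absent from the retrieved pool indicates a reduced capacity to proliferate; the function name and the labels are hypothetical and are not taken from the specification, which describes the comparison in laboratory terms.

```python
from typing import Iterable, List


def select_reduced_adaptation(input_markers: Iterable[str],
                              retrieved_markers: Iterable[str]) -> List[str]:
    """Return markers present in the input pool but absent after retrieval.

    Assumption: each marker identifies exactly one microorganism, so a marker
    missing from the retrieved pool flags a candidate with reduced capacity
    to proliferate in the environment.
    """
    retrieved = set(retrieved_markers)
    # Keep the input order so the result is reproducible.
    return [marker for marker in input_markers if marker not in retrieved]


# Hypothetical marker labels, for illustration only.
input_pool = ["tag01", "tag02", "tag03", "tag04"]      # step (a)
retrieved_pool = ["tag01", "tag03", "tag03"]           # step (c)
print(select_reduced_adaptation(input_pool, retrieved_pool))  # ['tag02', 'tag04']
```

Duplicate markers in the retrieved pool do not affect the result, because only presence or absence is compared.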

Please cancel claims 1, 2, and 4-56. 

Remarks 

Claims 3 and 57-86 are pending. Claim 3 has been amended. Claims 1, 2, and 4-56 have 
been canceled. Claims 57-86 are newly added. Claim 3 was amended to conform claim 3 to the 
language of new claim 57, from which it depends. New claims 57 and 86 recite forms of the 
disclosed method embodied in Example 1 (pages 46-56) and find support there. Example 1 
provides microorganisms containing different marker sequences. The microorganisms were
produced by introducing marker sequences into microorganisms. Claims 57 and 75 also find
support in original claim 1. New claim 58 finds support at least on page 17, lines 1-2, and in
original claim 4. Claims 59-79 find support at least in original claims 5-7, 16, 17, 8-10, 19, 12,
13, 18, 20, 21, 11, 30, 27, 14, 15, 25, and 26, respectively. New claim 69 also finds support on
page 15, lines 18-24. New claims 80 and 82 find support at least in original claims 1, 7, and 8, 
and on page 15, lines 26-30. New claim 81 finds support at least in original claims 7, 8, 16, and 
17 and on page 15, lines 26-30, and page 4, lines 25-27. New claim 83 finds support at least in 
original claims 1, 7, 8, and 20, and on page 15, lines 26-30. New claims 84 and 85 find support 
at least in original claims 1, 7, 8, 9, and 20, and on page 15, lines 26-30. A copy of all of the 
pending claims as they are believed to have been amended is attached to this Amendment as an 
appendix. 
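
Because the support for claims 59-79 is stated "respectively", the pairing of each new claim with its original claim can be spelled out as below. This is an illustrative sketch only: the ranges 5-7 and 8-10 from the paragraph above are expanded into individual claim numbers, and the variable names are hypothetical.

```python
# New claims 59 through 79, in order.
new_claims = list(range(59, 80))

# Original claims listed in the Remarks, with the ranges 5-7 and 8-10 expanded.
original_claims = [5, 6, 7, 16, 17, 8, 9, 10, 19, 12, 13,
                   18, 20, 21, 11, 30, 27, 14, 15, 25, 26]

# "Respectively" only makes sense if both lists have the same length.
assert len(new_claims) == len(original_claims) == 21

support = dict(zip(new_claims, original_claims))
for new, old in support.items():
    print(f"new claim {new} <- original claim {old}")
```

For example, new claim 64 pairs with original claim 8 and new claim 79 pairs with original claim 26.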

The specification has been amended to include reference to the parent applications and to 
annotate sequences in the specification. These amendments to the specification generally
correspond to amendments made during the course of prosecution of parent application Serial
No. 09/201,945. 



Date: November 16, 2000

Label Number EL 381 202 131 US addressed to Box Patent Application, Assistant
Commissioner for Patents, Washington, D.C. 20231.

Robert A. Hodges

Appendix: Claims As Pending After Amendment

3. (Amended) [A] The method according to [Claims 1 or 2] Claim 57 further comprising
[the steps:

(1A)] after step (a), removing auxotrophs from the plurality of [mutants produced in step
(1); or

(6A) determining whether the mutant selected in step (6) is an auxotroph; or
both (1A) and (6A)] mutant microorganisms.

57. (New) A method for identifying a mutant microorganism having a reduced 
adaptation to a particular environment comprising the steps of 

(a) providing a plurality of mutant microorganisms wherein each mutant contains a 
different marker sequence; 

(b) introducing the plurality of mutants of step (a) into the said particular environment 
and allowing those microorganisms which are able to do so to grow in the said environment; 

(c) retrieving microorganisms from the said environment or a selected part thereof; and 

(d) selecting an individual mutant having a reduced capacity to proliferate in the 
particular environment by comparing any marker sequences in the nucleic acid present in the 
retrieved microorganisms in step (c) to the different marker sequences referred to in step (a). 

58. (New) The method of Claim 57 for identifying a gene which allows a microorganism 
to adapt to a particular environment further comprising the step: 

(e) identifying the gene which is mutated in the individual mutant having a reduced 
capacity to proliferate in the particular environment. 

59. (New) The method of Claim 58 for isolating a gene which allows a microorganism to 
adapt to a particular environment further comprising the step: 

(f) isolating from a wild-type microorganism the corresponding wild-type gene. 

60. (New) The method of Claim 59 wherein the particular environment is a differentiated 
multicellular organism. 

61. (New) The method of Claim 60 wherein the multicellular organism is a plant.

62. (New) The method of Claim 61 wherein the microorganism is a bacterium
pathogenic to plants. 

63. (New) The method of Claim 61 wherein the microorganism is a fungus pathogenic to
plants.

64. (New) The method of Claim 60 wherein the multicellular organism is a non-human
animal.

65. (New) The method of Claim 64 wherein the animal is selected from the group 
consisting of a mouse, rat, rabbit, dog and monkey. 

66. (New) The method of Claim 65 wherein the animal is a mouse. 

67. (New) The method of Claim 64 wherein the microorganism is a fungus pathogenic
to animals. 

68. (New) The method of Claim 67 wherein the fungus is selected from the group 
consisting of Aspergillus spp., Cryptococcus neoformans and Histoplasma capsulatum. 

69. (New) The method of Claim 64 wherein in step (b) the microorganisms are 
introduced orally, intravenously, intranasally, or intraperitoneally. 

70. (New) The method of Claim 69 wherein in step (c) the microorganisms are retrieved 
from the spleen. 

71. (New) The method of Claim 64 wherein the microorganism is a bacterium
pathogenic to animals. 

72. (New) The method of Claim 71 wherein the bacterium is selected from the group 
consisting of Bordetella pertussis, Campylobacter jejuni, Clostridium botulinum, Escherichia
coli, Haemophilus ducreyi, Haemophilus influenzae, Helicobacter pylori, Klebsiella
pneumoniae, Legionella pneumophila, Listeria spp., Neisseria gonorrhoeae, Neisseria 
meningitidis, Pseudomonas spp., Salmonella spp., Shigella spp., Staphylococcus aureus, 
Streptococcus pyogenes, Streptococcus pneumoniae, Vibrio spp., and Yersinia pestis. 

73. (New) The method of Claim 60 wherein in step (c) the microorganisms are retrieved 
from the said environment at a site remote from the site of introduction in step (b). 

74. (New) A gene obtained by the method of Claim 59.

75. (New) A mutant microorganism comprising a mutation in a gene identified using the 
method of Claim 58. 

76. (New) The method of Claim 57 wherein the microorganism is a bacterium. 

77. (New) The method of Claim 57 wherein the microorganism is a fungus. 

78. (New) The method of Claim 57 wherein in step (d) the comparison of any marker 
sequences in the nucleic acid of the mutants retrieved in step (c) to the marker sequences referred 
to in step (a) uses DNA amplification techniques and oligonucleotide primers. 

79. (New) A mutant microorganism obtained by the method of Claim 57. 

80. (New) A non-human animal or plant, or an animal or plant cell culture, containing a 
plurality of mutant microorganisms wherein each mutant contains a different marker sequence. 

81. (New) The non-human animal or plant, or an animal or plant cell culture, of Claim 
80 wherein the microorganism is a pathogenic microorganism.

82. (New) A non-human animal or an animal cell culture containing a plurality of mutant 
microorganisms wherein each mutant contains a different marker sequence and wherein the 
microorganism is pathogenic to animals. 

83. (New) The non-human animal or an animal cell culture of Claim 82 wherein the 
microorganism is selected from the group consisting of Bordetella pertussis, Campylobacter 
jejuni, Clostridium botulinum, Escherichia coli, Haemophilus ducreyi, Haemophilus influenzae, 
Helicobacter pylori, Klebsiella pneumoniae, Legionella pneumophila, Listeria spp., Neisseria 
gonorrhoeae, Neisseria meningitidis, Pseudomonas spp., Salmonella spp., Shigella spp., 
Staphylococcus aureus, Streptococcus pyogenes, Streptococcus pneumoniae, Vibrio spp., and 
Yersinia pestis. 

84. (New) The non-human animal of Claim 83 which is a mouse or rat or rabbit or dog 
or monkey. 

85. (New) The non-human animal of Claim 82 which is a mouse or rat or rabbit or dog 
or monkey. 

86. (New) A method for identifying a microorganism having a reduced adaptation to a
particular environment comprising the steps of 

(a) providing a plurality of microorganisms wherein each microorganism contains a 
different marker sequence; 

(b) introducing the plurality of microorganisms of step (a) into the said particular 
environment and allowing those microorganisms which are able to do so to grow in the said 
environment; 

(c) retrieving microorganisms from the said environment or a selected part thereof; and 

(d) selecting an individual microorganism having a reduced capacity to proliferate in the 
particular environment by comparing any marker sequences in the nucleic acid present in the 
retrieved microorganisms in step (c) to the different marker sequences referred to in step (a). 

IDENTIFICATION OF GENES



The present invention relates to methods for the identification of genes 
involved in the adaptation of a microorganism to its environment, 
particularly the identification of genes responsible for the virulence of a
pathogenic microorganism. 

Background to the invention 

Antibiotic resistance in bacterial and other pathogens is becoming
increasingly important. It is therefore important to find new therapeutic 
approaches to attack pathogenic microorganisms. 

Pathogenic microorganisms have to evade the host's defence mechanisms 
and be able to grow in a poor nutritional environment to establish an
infection. To do so a number of "virulence" genes of the microorganism 
are required. 

Virulence genes have been detected using classical genetics and a variety
of approaches have been used to exploit transposon mutagenesis for the
identification of bacterial virulence genes. For example, mutants have
been screened for defined physiological defects, such as the loss of iron
regulated proteins (Holland et al, 1992), or in assays to study the
penetration of epithelial cells (Finlay et al, 1988) and survival within
macrophages (Fields et al, 1989; Miller et al, 1989a; Groisman et al,
1989). Transposon mutants have also been tested for altered virulence in
live animal models of infection (Miller et al, 1989b). This approach has
the advantage that genes can be identified which are important during
different stages of infection, but is severely limited by the need to test a
wide range of mutants individually for alterations to virulence. Miller et
al (1989b) used groups of 8 to 10 mice and orally infected each of 95
separate groups with a different mutant, thereby using between 760 and
950 mice. Because of the extremely large numbers of animals required,
comprehensive screening of a bacterial genome for virulence genes has not
been feasible.

Recently a genetic system (in vivo expression technology [IVET]) was
described which positively selects for Salmonella genes that are
specifically induced during infection (Mahan et al, 1993). The technique
will identify genes that are expressed at a particular stage in the infection
process. However, it will not identify virulence genes that are regulated
posttranscriptionally and, more importantly, will not provide information
on whether the gene(s) which have been identified are actually required
for, or contribute to, the infection process.

Lee & Falkow (1994) Methods Enzymol. 236, 531-545 describe a method
of identifying factors influencing the invasion of Salmonella into
mammalian cells in vitro by isolating hyperinvasive mutants.

Walsh and Cepko (1992) Science 255, 434-440 describe a method of
tracking the spatial location of cerebral cortical progenitor cells during the
development of the cerebral cortex in the rat. The Walsh and Cepko
method uses a tag that contains a unique nucleic acid sequence and the
lacZ gene but there is no indication that useful mutants or genes could be
detected by their method.

WO 94/26933 and Smith et al (1995) Proc. Natl. Acad. Sci. USA 92,
6479-6483 describe methods aimed at the identification of the functional
regions of a known gene, or at least of a DNA molecule for which some
sequence information is available.

Groisman et al (1993) Proc. Natl. Acad. Sci. USA 90, 1033-1037
describe the molecular, functional and evolutionary analysis of sequences
specific to Salmonella.

Some virulence genes are already known for pathogenic microorganisms
such as Escherichia coli, Salmonella typhimurium, Salmonella typhi,
Vibrio cholerae, Clostridium botulinum, Yersinia pestis, Shigella flexneri
and Listeria monocytogenes but in all cases only a relatively small number
of the total have been identified.

The disease which Salmonella typhimurium causes in mice provides a good
experimental model of typhoid fever (Carter & Collins, 1974).
Approximately forty two genes affecting Salmonella virulence have been
identified to date (Groisman & Ochman, 1994). These represent
approximately one third of the total number of predicted virulence genes
(Groisman and Saier, 1990).

The object of the present invention is to identify genes involved in the
adaptation of a microorganism to its environment, particularly to identify
further virulence genes in pathogenic microorganisms, with increased
efficiency. A further object is to reduce the number of experimental
animals used in identifying virulence genes. Still further objects of the
invention are to provide vaccines, and methods for screening for drugs
which reduce virulence.

Summary of the invention 

A first aspect of the invention provides a method for identifying a
microorganism having a reduced adaptation to a particular environment
comprising the steps of:

(1) providing a plurality of microorganisms each of which is
independently mutated by the insertional inactivation of a gene with a
nucleic acid comprising a unique marker sequence so that each mutant
contains a different marker sequence, or clones of the said microorganism;

(2) providing individually a stored sample of each mutant
produced by step (1) and providing individually stored nucleic acid
comprising the unique marker sequence from each individual mutant;

(3) introducing a plurality of mutants produced by step (1) into
the said particular environment and allowing those microorganisms which
are able to do so to grow in the said environment;

(4) retrieving microorganisms from the said environment or a
selected part thereof and isolating the nucleic acid from the retrieved
microorganisms;

(5) comparing any marker sequences in the nucleic acid isolated
in step (4) to the unique marker sequence of each individual mutant stored
as in step (2); and

(6) selecting an individual mutant which does not contain any of
the marker sequences as isolated in step (4).

Thus, the method uses negative selection to identify microorganisms with
reduced capacity to proliferate in the environment. 
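
The negative selection in steps (5) and (6) amounts to a set difference between the stored marker sequences and those recovered from the environment. The following minimal Python sketch illustrates that logic only; the well identifiers and tag sequences are hypothetical, and in practice the comparison is made by hybridisation rather than by exact sequence matching.

```python
def select_attenuated(stored_tags, recovered_tags):
    """Return identifiers of mutants whose tag was not recovered.

    stored_tags: dict mapping a mutant identifier (e.g. a well position)
        to its unique marker sequence, as stored in step (2).
    recovered_tags: iterable of marker sequences found in the nucleic
        acid isolated in step (4).
    """
    recovered = set(recovered_tags)
    return sorted(m for m, tag in stored_tags.items() if tag not in recovered)


# Hypothetical example: three mutants go in, only two tags come back.
stored = {"A1": "ACGTTGCA", "A2": "TTGACCGA", "A3": "GGCATCAT"}
recovered = ["ACGTTGCA", "GGCATCAT", "ACGTTGCA"]
attenuated = select_attenuated(stored, recovered)
print(attenuated)
```

Mutants returned by such a comparison are candidates only; a tag may also be absent for reasons unrelated to the inactivated gene.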

A microorganism can live in a number of different environments and it is 
known that particular genes and their products allow the microorganism 
to adapt to a particular environment. For example, in order for a 
pathogenic microorganism, such as a pathogenic bacterium or pathogenic 
fungus, to survive in its host the product of one or more virulence genes 
is required. Thus, in a preferred embodiment of the invention a gene of 
a microorganism which allows the microorganism to adapt to a particular 
environment is a virulence gene. 



Conveniently, the particular environment is a differentiated multicellular
organism such as a plant or animal. Many bacteria and fungi are known
to infect plants and they are able to survive within the plant and cause
disease because of the presence of and expression from virulence genes.
Suitable microorganisms when the particular environment is a plant
include the bacteria Agrobacterium tumefaciens which forms tumours
(galls) particularly in grape; Erwinia amylovora; Pseudomonas
solanacearum which causes wilt in a wide range of plants; Rhizobium
leguminosarum which causes disease in beans; Xanthomonas campestris
pv. citri which causes canker in citrus fruits; and include the fungi
Magnaporthe grisea which causes rice blast disease; Fusarium spp. which
cause a variety of plant diseases; Erysiphe spp.; Colletotrichum
gloeosporioides; Gaeumannomyces graminis which causes root and crown
diseases in cereals and grasses; Glomus spp.; Laccaria spp.; Leptosphaeria
maculans; Phoma tracheiphila; Phytophthora spp.; Pyrenophora teres;
Verticillium albo-atrum and V. dahliae; and Mycosphaerella musicola and
M. fijiensis. As described in more detail below, when the microorganism
is a fungus a haploid phase to its life cycle is required.

Similarly, many microorganisms including bacteria, fungi, protozoa and
trypanosomes are known to infect animals, particularly mammals including
humans. Survival of the microorganism within the animal and the ability
of the microorganism to cause disease is due in large part to the presence
of and expression from virulence genes. Suitable bacteria include
Bordetella spp. particularly B. pertussis, Campylobacter spp. particularly
C. jejuni, Clostridium spp. particularly C. botulinum, Enterococcus spp.
particularly E. faecalis, Escherichia spp. particularly E. coli, Haemophilus
spp. particularly H. ducreyi and H. influenzae, Helicobacter spp.
particularly H. pylori, Klebsiella spp. particularly K. pneumoniae,
Legionella spp. particularly L. pneumophila, Listeria spp. particularly L.
monocytogenes, Mycobacterium spp. particularly M. smegmatis and M.
tuberculosis, Neisseria spp. particularly N. gonorrhoeae and N.
meningitidis, Pseudomonas spp. particularly Ps. aeruginosa, Salmonella
spp., Shigella spp., Staphylococcus spp. particularly S. aureus,
Streptococcus spp. particularly S. pyogenes and S. pneumoniae, Vibrio spp.
and Yersinia spp. particularly Y. pestis. All of these bacteria cause disease
in man and there are also animal models of the disease. Thus, when these
bacteria are used in the method of the invention, the particular
environment is an animal which they can infect and in which they cause
disease. For example, when Salmonella typhimurium is used to infect a
mouse the mouse develops a disease which serves as a model for typhoid
fever in man. Staphylococcus aureus causes bacteraemia and renal abscess
formation in mice (Albus et al (1991) Infect. Immun. 59, 1008-1014) and
endocarditis in rabbits (Perlman & Freedman (1971) Yale J. Biol. Med.
44, 206-213).

It is required that a fungus or higher eukaryotic parasite is haploid for the
relevant parts of its life (such as growth in the environment). Preferably,
a DNA-mediated integrative transformation system is available and, when
the microorganism is a human pathogen, conveniently an animal model of
the human disease is available. Suitable fungi pathogenic to humans
include certain Aspergillus spp. (for example A. fumigatus), Cryptococcus
neoformans and Histoplasma capsulatum. Clearly the above-mentioned
fungi have a haploid phase and DNA-mediated integrative transformation
systems are available for them. Toxoplasma may also be used, being a
parasite with a haploid phase during infection. Bacteria have a haploid
genome.

Animal models of human disease are often available in which the animal
is a mouse, rat, rabbit, dog or monkey. It is preferred if the animal is a
mouse. Virulence genes detected by the method of the invention using an
animal model of a human disease are clearly very likely to be genes which
determine the virulence of the microorganism in man.

Particularly preferred microorganisms for use in the methods of the
invention are Salmonella typhimurium, Staphylococcus aureus,
Streptococcus pneumoniae, Enterococcus faecalis, Pseudomonas
aeruginosa and Aspergillus fumigatus.

A preferred embodiment of the invention is now described.

A nucleic acid comprising a unique marker sequence is made as follows.
A complex pool of double stranded DNA sequence "tags" is generated
using oligonucleotide synthesis and a polymerase chain reaction (PCR).
Each DNA "tag" has a unique sequence of between about 20 and 80 bp,
preferably about 40 bp, which is flanked by "arms" of about 15 to 30 bp,
preferably about 20 bp, which are common to all "tags". The number of
bp in the unique sequence is sufficient to allow large numbers (for
example >10^10) of unique sequences to be generated by random
oligonucleotide synthesis but not so large as to allow the formation of
secondary structures which may interfere with a PCR. Similarly, the
length of the arms should be sufficient to allow efficient priming of
oligonucleotides in a PCR.
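
As a rough illustration of the tag design described above, the following Python sketch builds tags from a random 40 bp core flanked by common 20 bp arms and counts the possible cores. The arm sequences are arbitrary placeholders invented for the example, not sequences from the invention, and a pseudo-random generator stands in for random oligonucleotide synthesis.

```python
import random

# Arbitrary placeholder 20 bp arms common to all tags (hypothetical).
LEFT_ARM = "ACTCATCTTACCTACAACTC"
RIGHT_ARM = "CATACCATTCTAACCAATCA"


def make_tag(core_length=40, rng=random):
    """Return left arm + random core of core_length bases + right arm."""
    core = "".join(rng.choice("ACGT") for _ in range(core_length))
    return LEFT_ARM + core + RIGHT_ARM


def possible_cores(core_length=40):
    """Number of distinct sequences of this length (4 bases per position)."""
    return 4 ** core_length


rng = random.Random(0)  # fixed seed so the example is repeatable
tags = [make_tag(rng=rng) for _ in range(96)]
print(len(set(tags)), "distinct tags, each", len(tags[0]), "bp long")
print("40 bp core exceeds 10^10 sequences:", possible_cores(40) > 10 ** 10)
```

Because four bases are possible at each position, even a 17 bp core allows more than 10^10 sequences; a longer core mainly adds hybridisation specificity.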

It is well known that the sequence at the 5' end of the oligonucleotide
need not match the target sequence to be amplified.

It is usual that the PCR primers do not contain any complementary
structures with each other longer than 2 bases, especially at their 3' ends,
as this feature may promote the formation of an artifactual product called
"primer dimer". When the 3' ends of the two primers hybridize, they
form a "primed template" complex, and primer extension results in a short
duplex product called "primer dimer".

Internal secondary structure should be avoided in primers. For symmetric
PCR, a 40-60% G+C content is often recommended for both primers,
with no long stretches of any one base. The classical melting temperature
calculations used in conjunction with DNA probe hybridization studies
often predict that a given primer should anneal at a specific temperature
or that the 72°C extension temperature will dissociate the primer/template
hybrid prematurely. In practice, the hybrids are more effective in the
PCR process than generally predicted by simple Tm calculations.

Optimum annealing temperatures may be determined empirically and may
be higher than predicted. Taq DNA polymerase does have activity in the
37-55°C region, so primer extension will occur during the annealing step
and the hybrid will be stabilized. The concentrations of the primers are
equal in conventional (symmetric) PCR and are typically within the 0.1 to
1 µM range.
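
The primer guidelines above can be checked mechanically. The sketch below is a minimal Python illustration, not a validated primer-design tool: the helper names and example primers are hypothetical, and the melting temperature uses the simple Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is exactly the kind of simple estimate that tends to underpredict how well primers perform in practice.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C (rough estimate)."""
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def three_prime_overlap(primer_a, primer_b):
    """Largest n such that the last n bases of primer_a can pair
    antiparallel with the last n bases of primer_b (primer-dimer risk)."""
    rc_b = reverse_complement(primer_b)
    longest = 0
    for n in range(1, min(len(primer_a), len(primer_b)) + 1):
        if primer_a[-n:] == rc_b[:n]:
            longest = n
    return longest


forward = "ACTCATCTTACCTACAACTC"  # hypothetical primer
reverse = "TGATTGGTTAGAATGGTATG"  # hypothetical primer
for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, round(gc_fraction(primer), 2), wallace_tm(primer))
print("3' overlap:", three_prime_overlap(forward, reverse))
```

A primer pair would be flagged for redesign if the G+C fraction fell outside 0.4 to 0.6 or if the 3' overlap exceeded 2 bases.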

The "tags" are ligated into a transposon or transposon-like element to
form the nucleic acid comprising a unique marker sequence.
Conveniently, the transposon is carried on a suicide vector which is
maintained as a plasmid in a "helper" organism, but which is lost after
transfer to the microorganism of the method of the invention. For
example, the "helper" organism may be a strain of Escherichia coli, the
microorganism of the method may be Salmonella and the transfer is a
conjugal transfer. Although the transposon can be lost after transfer, in
a proportion of cells it undergoes a transposition event through which it
integrates at random, along with its unique tag, into the genome of the
microorganism used in the method. It is most preferred if the transposon
or transposon-like element can be selected. For example, in the case of
Salmonella, a kanamycin resistance gene may be present in the transposon
and exconjugants are selected on medium containing kanamycin. It is also
possible to complement an auxotrophic marker in the recipient cell with
a functional gene in the nucleic acid comprising the unique marker. This
method is particularly convenient when fungi are used in the method.

Preferably the complementing functional gene is not derived from the
same species as the recipient microorganism, otherwise non-random
integration events may occur.

It is also particularly convenient if the transposon or transposon-like
element is carried on a vector which is maintained episomally (ie not as
part of the chromosome) in the microorganism used in the method of the
first aspect of the invention when in a first given condition whereas, upon
changing that condition to a second given condition, the episome cannot
be maintained, permitting the selection of a cell in which the transposon or
transposon-like element has undergone a transposition event through which
it integrates at random, along with its unique tag, into the genome
of the microorganism used in the method. This particularly convenient
embodiment is advantageous because, once a microorganism carrying the
episomal vector is made, then each time the transposition event is selected
for or induced by changing the condition of the microorganism (or a clone
thereof) from a first given condition to a second given condition, the
transposon can integrate at a different site in the genome of the
microorganism. Thus, once a master collection of microorganisms is
made, each member of which contains a unique tag sequence in the
transposon or transposon-like element carried on the episomal vector
(when in the first given condition), it can be used repeatedly to generate
pools of random insertional mutants, each of which contains a different tag
sequence (ie unique within the pool). This embodiment is particularly
useful because (a) it reduces the number and complexity of manipulations
required to generate the plurality ("pool") of independently mutated
microorganisms in step (1) of the method; and (b) the number of different
tags need only be the same as the number of microorganisms in the
plurality of microorganisms in step (1) of the method. Point (a) makes the
method easier to use in organisms for which transposon mutagenesis is
more difficult to perform (for example, Staphylococcus aureus) and point
(b) means that tag sequences with particularly good hybridisation
characteristics can be selected, therefore making quality control easier. As
is described in more detail below, the "pool" size is conveniently about
100 or 200 independently-mutated microorganisms and, therefore, the
master collection of microorganisms is conveniently stored in one or two
96-well microtitre plates.

In a particularly preferred embodiment the first given condition is a first
particular temperature or temperature range such as 25°C to 32°C, most
preferably about 30°C, and the second given condition is a second
particular temperature or temperature range such as 35°C to 45°C, most
preferably 42°C. In further preferred embodiments the first given
condition is the presence of an antibiotic, such as streptomycin, and the
second given condition is the absence of the said antibiotic; or the first
given condition is the absence of an antibiotic and the second given
condition is the presence of the said antibiotic.

Transposons suitable for integration into the genome of Gram negative
bacteria include Tn5, Tn10 and derivatives thereof. Transposons suitable
for integration into the genome of Gram positive bacteria include Tn916
and derivatives or analogues thereof. Transposons particularly suited for
use with Staphylococcus aureus include Tn917 (Cheung et al (1992) Proc.
Natl. Acad. Sci. USA 89, 6462-6466) and Tn918 (Albus et al (1991)
Infect. Immun. 59, 1008-1014).

It is particularly preferred if the transposons have the properties of the
Tn917 derivatives described by Camilli et al (1990) J. Bacteriol. 172,
3738-3744, and are carried by a temperature-sensitive vector such as
pE194Ts (Villafane et al (1987) J. Bacteriol. 169, 4822-4829).

It will be appreciated that although transposons are convenient for
insertionally inactivating a gene, any other known method, or method
developed in the future, may be used. A further convenient method of
insertionally inactivating a gene, particularly in certain bacteria such as
Streptococcus, is using insertion-duplication mutagenesis such as that
described in Morrison et al (1984) J. Bacteriol. 159, 870 with respect to
S. pneumoniae. The general method may also be applied to other
microorganisms, especially bacteria.

For fungi, insertional mutations are created by transformation using DNA
fragments or plasmids carrying the "tags" and, preferably, selectable
markers encoding, for example, resistance to hygromycin B or phleomycin
(see Smith et al (1994) Infect. Immun. 62, 5247-5254). Random, single
integration of DNA fragments encoding hygromycin B resistance into the
genome of filamentous fungi, using restriction enzyme mediated
integration (REMI; Schiestl & Petes (1991); Lu et al (1994) Proc. Natl.
Acad. Sci. USA 91, 12649-12653), is known.

Simple insertional mutagenesis techniques for a fungus are described in
Schiestl & Petes (1994), incorporated herein by reference, and include, for
example, the use of Ty elements and ribosomal DNA in yeast.

Random integration of the transposon or other DNA sequence allows
isolation of a plurality of independently mutated microorganisms wherein
a different gene is insertionally inactivated in each mutant and each mutant
contains a different marker sequence.

A library of such insertion mutants is arrayed in welled microtitre dishes
so that each well contains a different mutant microorganism. DNA
comprising the unique marker sequence from each individual mutant
microorganism (conveniently, the total DNA from the clone is used) is
stored. Conveniently, this is done by removing a sample of the
microorganism from the microtitre dish, spotting it onto a nucleic acid
hybridisation membrane (such as nitrocellulose or nylon membranes),
lysing the microorganism in alkali and fixing the nucleic acid to the
membrane. Thus, a replica of the contents of the welled microtitre dishes
is made.

Pools of the microorganisms from the welled microtitre dish are made and
DNA is extracted. This DNA is used as a target for a PCR using primers
that anneal to the common "arms" flanking the "tags" and the amplified
DNA is labelled, for example with 32P. The product of the PCR is used
to probe the DNA stored from each individual mutant to provide a
reference hybridisation pattern for the replicas of the welled microtitre
dishes. This is a check that each of the individual microorganisms does,
in fact, contain a marker sequence and that the marker sequence can be
amplified and labelled efficiently.

Pools of transposon mutants are made to introduce into the particular
environment. Conveniently, 96-well microtitre dishes are used and the
pool contains 96 transposon mutants. However, the lower limit for the
pool is two mutants; there is no theoretical upper limit to the size of the
pool but, as discussed below, the upper limit may be determined in
relation to the environment into which the mutants are introduced.

Once the microorganisms are introduced into the said particular 
environment those microorganisms which are able to do so are allowed to 
grow in the environment. The length of time the microorganisms are left 
in the environment is determined by the nature of the microorganism and 
the environment. After a suitable length of time, the microorganisms are 
recovered from the environment, DNA is extracted and the DNA is used 
as a template for a PCR using primers that anneal to the "arms" flanking
the "tags". The PCR product is labelled, for example with 32P, and is
used to probe the DNA stored from each individual mutant replicated from
the welled microtitre dish. Stored DNAs are identified which hybridise
weakly or not at all with the probe generated from the DNA isolated from
the microorganisms recovered from the environment. These non-hybridising
DNAs correspond to mutants whose adaptation to the particular 
environment has been attenuated by insertion of the transposon or other 
DNA sequence. 
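
The comparison of hybridisation patterns can be sketched as follows. This Python example is illustrative only: the signal values, the well names and the two thresholds are hypothetical, and in practice reading an autoradiograph is a matter of judgement rather than fixed cut-offs.

```python
def find_attenuated(input_signal, output_signal,
                    min_input=0.5, max_output_ratio=0.2):
    """Return wells whose tag hybridises well with the input-pool probe
    but weakly or not at all with the probe from recovered organisms.

    Both arguments map a well name to a relative signal intensity.
    Wells with a weak input signal are skipped because their tags do
    not amplify or label well enough to be informative.
    """
    candidates = []
    for well, before in input_signal.items():
        if before < min_input:
            continue  # uninformative tag
        after = output_signal.get(well, 0.0)
        if after / before <= max_output_ratio:
            candidates.append(well)
    return sorted(candidates)


# Hypothetical relative intensities for four wells.
input_signal = {"A1": 1.0, "A2": 0.9, "A3": 0.1, "A4": 0.8}
output_signal = {"A1": 0.95, "A2": 0.05, "A3": 0.0, "A4": 0.7}
candidates = find_attenuated(input_signal, output_signal)
print(candidates)
```

Well A3 is excluded rather than scored as attenuated, which mirrors the purpose of the reference hybridisation pattern: a tag that never gave a signal cannot show a loss of signal.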

In a particularly preferred embodiment the "arms" have no, or very little,
label compared to the "tags". For example, the PCR primers are suitably
designed to contain no, or a single, G residue, the 32P-labelled nucleotide
is dCTP and, in this case, no or one radiolabeled C residue is
incorporated in each "arm" but a greater number of radiolabeled C
residues are incorporated in the "tag". It is preferred if the "tag" has at
least ten-fold more label incorporated than the "arms"; preferably
twenty-fold or more; more preferably fifty-fold or more. Conveniently the
"arms" can be removed from the "tag" using a suitable restriction
enzyme, a site for which may be incorporated in the primer design.
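
To see why primers with few G residues keep label out of the arms, note that each newly synthesised strand carries a labelled C wherever its template has a G; the primer itself is unlabelled, and the copy of the opposite arm is the complement of the other primer. The Python sketch below counts labelled C residues under that simplified model. The sequences are hypothetical, and the model ignores incomplete extension and any variation in labelling efficiency.

```python
def labelled_c_counts(tag, primer_1, primer_2):
    """Count labelled C residues in the tag and arm regions when both
    strands of primer_1 + tag + revcomp(primer_2) are synthesised in
    the presence of labelled dCTP.

    The top-strand product copies the tag and the complement of
    primer_2; the bottom-strand product copies the complement of the
    tag and the complement of primer_1. Returns (in_tag, in_arms).
    """
    # C in the top-strand tag plus C in the bottom-strand tag (= G in tag).
    in_tag = tag.count("C") + tag.count("G")
    # Each G in a primer becomes a labelled C in the opposite strand's arm.
    in_arms = primer_1.count("G") + primer_2.count("G")
    return in_tag, in_arms


tag = "ACGT" * 10                    # hypothetical 40 bp tag, 50% G+C
primer_1 = "ACTCATCTTACCTACAAGTC"    # hypothetical primer with one G
primer_2 = "CATACCATTCTAACCAATGA"    # hypothetical primer with one G
in_tag, in_arms = labelled_c_counts(tag, primer_1, primer_2)
print(in_tag, in_arms)
```

With one G per primer and a 40 bp tag of average composition, this model gives the ten-fold excess of label in the tag that is stated above as the minimum preference; primers with no G leave the arms unlabelled.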

As discussed above, a particularly preferred embodiment of the invention
is when the microorganism is a pathogenic microorganism and the
particular environment is an animal. In this embodiment, the size of the
pool of mutants introduced into the animal is determined by (a) the
number of cells of each mutant that are likely to survive in the animal
(assuming a virulence gene has not been inactivated) and (b) the total
inoculum of the microorganism. If the number in (a) is too low then false
positive results may arise and if the number in (b) is too high then the
animal may die before enough mutants have had a chance to grow in the
desired way. The number of cells in (a) can be determined for each
microorganism used but it is preferably more than 50, more preferably
more than 100.

The number of different mutants that can be introduced into a single
animal is preferably between 50 and 500, conveniently about 100. It is
preferred if the total inoculum does not exceed 10^6 cells (and it is
preferably 10^5 cells) although the size of the inoculum may be varied
above or below this amount depending on the microorganism and the
animal.

In a particularly convenient method an inoculum of 10^5 is used containing
1000 cells each of 100 different mutants for a single animal. It will be
appreciated that in this method one animal can be used to screen 100
mutants compared to prior art methods which require at least 100 animals
to screen 100 mutants.

However, it is convenient to inoculate three animals with the same pool
of mutants so that at least two can be investigated (one as a replica to
check the reliability of the method), whilst the third is kept as a back-up.
Nevertheless, the method still provides a greater than 30-fold saving in the
number of animals used.
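
The arithmetic behind these figures can be checked with a few lines of Python. The sketch uses only the numbers given above (100 mutants, 1000 cells of each, three animals per pool, and at least one animal per mutant in prior methods); the function names are illustrative.

```python
def total_inoculum(n_mutants, cells_per_mutant):
    """Total number of cells introduced into one animal."""
    return n_mutants * cells_per_mutant


def animal_saving(n_mutants, animals_per_pool, animals_per_mutant=1):
    """Fold reduction in animals relative to testing mutants one by one."""
    return (n_mutants * animals_per_mutant) / animals_per_pool


inoculum = total_inoculum(100, 1000)
saving = animal_saving(100, animals_per_pool=3)
print(inoculum, round(saving, 1))
```

The saving is larger still when the individual tests would each have used a group of animals rather than a single one.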

The time between the pool of mutants being introduced into the animal
and the microorganisms being recovered may vary with the microorganism
and animal used. For example, when the animal is a mouse and the
microorganism is Salmonella typhimurium, the time between inoculation
and recovery is about three days.

In one embodiment of the invention microorganisms are retrieved from the
environment in step (4) at a site remote from the site of introduction in
step (3), so that the virulence genes being investigated include those
involved in the spread of the microorganism between the two sites.

For example, in a plant the microorganism may be introduced in a lesion 
in the stem or at one site on a leaf and the microorganism retrieved from
another site on the leaf where a disease state is indicated. 

In the case of an animal, the microorganism may be introduced orally,
intraperitoneally, intravenously or intranasally and is retrieved at a later
time from an internal organ such as the spleen. It may be useful to
compare the virulence genes identified by oral administration and those
identified by intraperitoneal administration as some genes may be required
to establish infection by one route but not by the other. It is preferred if
Salmonella is introduced intraperitoneally.

Other preferred environments which may be used to identify virulence
genes are animal cells in culture (particularly macrophages and epithelial
cells) and plant cells in culture. Although using cells in culture will be
useful in its own right, it will also complement the use of the whole
animal or plant, as the case may be, as the environment.

It is also preferred if the environment is a part of the animal body.
Within a given host-parasite interaction, a number of different
environments are possible, including different organs and tissues, and
parts thereof such as the Peyer's patch.

The number of individual microorganisms (i.e. cells) recovered from the
environment should be at least twice, preferably at least ten times, more
preferably 100 times the number of different mutants introduced into the
environment. For example, when an animal is inoculated with 100
different mutants around 10,000 individual microorganisms should be
retrieved and their marker DNA isolated. 
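
The recovery ratios above can be turned into target cell counts with simple arithmetic. The following is a minimal sketch only; the function name and the labels are illustrative and are not part of the described method.

```python
def recovery_targets(num_mutants: int) -> dict:
    """Return the number of cells to recover for each preference level.

    The multipliers (2x, 10x, 100x) are taken from the text; the
    dictionary labels are illustrative only.
    """
    if num_mutants <= 0:
        raise ValueError("num_mutants must be positive")
    return {
        "at least (2x)": 2 * num_mutants,
        "preferably at least (10x)": 10 * num_mutants,
        "more preferably (100x)": 100 * num_mutants,
    }


if __name__ == "__main__":
    # The worked example from the text: a pool of 100 different mutants.
    for level, cells in recovery_targets(100).items():
        print(f"{level}: {cells} cells")
```

For a pool of 100 mutants the 100-fold level gives the figure of around 10,000 cells quoted above.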

A further preferred embodiment comprises the steps: 

(1A) removing auxotrophs from the plurality of mutants produced in step
(1); or

(6A) determining whether the mutant selected in step (6) is an auxotroph;
or

both (1A) and (6A).

It is desirable to distinguish an auxotroph (that is a mutant microorganism 
which requires growth factors not needed by the wild type or by 
prototrophs) and a mutant microorganism wherein a gene allowing the
microorganism to adapt to a particular environment is inactivated. 
Conveniently, this is done between steps (1) and (2) or after step (6). 

Preferably auxotrophs are not removed when virulence genes are being 
identified.



A second aspect of the invention provides a method of identifying a gene 
which allows a microorganism to adapt to a particular environment, the 
method comprising the method of the first aspect of the invention, 
followed by the additional step: 

(7) isolating the insertionally-inactivated gene or part thereof from the 
individual mutant selected in step (6). 

Methods for isolating a gene containing a unique marker are well known 
in the art of molecular biology.

A further preferred embodiment comprises the further additional step: 

(8) isolating from a wild-type microorganism the corresponding wild-type
gene using the insertionally-inactivated gene isolated in step (7) or
part thereof as a probe.

Methods for gene probing are well known in the art of molecular biology. 

Molecular biological methods suitable for use in the practice of the present
invention are disclosed in Sambrook et al (1989) incorporated herein by 
reference. 

When the microorganism is a microorganism pathogenic to an animal and
the gene is a virulence gene and a transposon has been used to
insertionally inactivate the gene, it is convenient for the virulence genes
to be cloned by digesting genomic DNA from the individual mutant
selected in step (6) with a restriction enzyme which cuts outside the
transposon, ligating size-fractionated DNA containing the transposon into
a plasmid, and selecting plasmid recombinants on the basis of antibiotic
resistance conferred by the transposon and not by the plasmid. The
microorganism genomic DNA adjacent to the transposon is sequenced
using two primers which anneal to the terminal regions of the transposon,
and two primers which anneal close to the polylinker sequences of the
plasmid. The sequences may be subjected to DNA database searches to
determine if the transposon has interrupted a known virulence gene.
Thus, conveniently, sequence obtained by this method is compared against
the sequences present in the publicly available databases such as EMBL
and GenBank. Finally, if the interrupted sequence appears to be in a new
virulence gene, the mutation is transferred to a new genetic background
(for example by phage P22-mediated transduction in the case of
Salmonella) and the LD50 of the mutant strain is determined to confirm
that the avirulent phenotype is due to the transposition event and not a
secondary mutation.
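
By way of illustration only, choosing a restriction enzyme which cuts outside the transposon can be sketched as a search for enzymes whose recognition site is absent from the transposon sequence. The recognition sites listed are the standard ones for these five enzymes; the transposon sequence is a made-up toy string and the function name is hypothetical. Because all five sites are palindromic, searching one strand is sufficient in this sketch.

```python
# Standard recognition sites of five common restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "PstI": "CTGCAG",
}


def enzymes_cutting_outside(transposon_seq: str) -> list:
    """Return the enzymes that have no recognition site in the transposon.

    Such enzymes leave the transposon intact, so a genomic fragment
    carrying the whole transposon plus flanking DNA can be cloned.
    """
    seq = transposon_seq.upper()
    return sorted(
        name for name, site in RECOGNITION_SITES.items() if site not in seq
    )


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT a real transposon; it contains a
    # BamHI site (GGATCC) and a HindIII site (AAGCTT).
    toy_transposon = "ATGCGGATCCTTAGCAAGCTTGGCATTACG"
    print(enzymes_cutting_outside(toy_transposon))
```

In practice the full transposon sequence and a complete enzyme list would be used; the sketch only shows the selection criterion.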

The number of individual mutants screened in order to detect all of the 
virulence genes in a microorganism depends on the number of genes in the 
genome of the microorganism. For example, it is likely that 3000-5000 
mutants of Salmonella typhimurium need to be screened in order to detect
the majority of virulence genes whereas for Aspergillus spp., which has
a larger genome than Salmonella, around 20,000 mutants are screened.
Approximately 4% of non-essential S. typhimurium genes are thought to 
be required for virulence (Grossman & Saier, 1990) and, if so, the S. 
typhimurium genome contains approximately 150 virulence genes. 
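
The figures above can be checked with a short calculation. The sketch below first derives the number of non-essential genes implied by the 4% and 150 figures, and then estimates what fraction of genes would carry at least one insertion for a given number of mutants. The second part rests on a simplifying assumption that is not stated in the text: each mutant carries one insertion that falls in any non-essential gene with equal probability, independently of the others. Function names are illustrative.

```python
def implied_nonessential_genes(virulence_genes: int, fraction: float) -> int:
    """Number of non-essential genes implied by the quoted figures."""
    return round(virulence_genes / fraction)


def expected_fraction_hit(num_mutants: int, num_genes: int) -> float:
    """Expected fraction of genes with at least one insertion.

    ASSUMPTION (not from the text): insertions are independent and
    uniformly distributed over num_genes equally sized targets.
    """
    return 1.0 - (1.0 - 1.0 / num_genes) ** num_mutants


if __name__ == "__main__":
    genes = implied_nonessential_genes(150, 0.04)
    print(f"implied non-essential genes: {genes}")
    for mutants in (3000, 5000):
        frac = expected_fraction_hit(mutants, genes)
        print(f"{mutants} mutants: {frac:.0%} of genes expected to be hit")
```

Under this simple model, screening 3000 to 5000 mutants would be expected to hit somewhat more than half to roughly three quarters of the non-essential genes, which is consistent with detecting the majority of virulence genes.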

However, the methods of the invention provide a faster, more convenient
and much more practicable route to identifying virulence genes. 



A third aspect of the invention provides a microorganism obtained using 
the method of the first aspect of the invention. 

Such microorganisms are useful because they have the property of not
being adapted to survive in a particular environment.

In a preferred embodiment, a pathogenic microorganism is not adapted to 
survive in a host organism (environment) and, in the case of
microorganisms that are pathogenic to animals, particularly mammals, 
more particularly humans, the mutant obtained by the method of the 
invention may be used in a vaccine. The mutant is avirulent, and 
therefore expected to be suitable for administration to a patient, but it is 
expected to be antigenic and give rise to a protective immune response.

In a further preferred embodiment the pathogenic microorganism not 
adapted to survive in a host organism, obtained by the methods of the 
invention, is modified, preferably by the introduction of a suitable DNA 
sequence, to express an antigenic epitope from another pathogen. This
modified microorganism can act as a vaccine for that other pathogen. 

A fourth aspect of the invention provides a microorganism comprising a 
mutation in a gene identified using the method of the second aspect of the 
invention.

Thus, although the microorganism of the third aspect of the invention is 
useful, it is preferred if a mutation is specifically introduced into the 
identified gene. In a preferred embodiment, particularly when the
microorganism is to be used in a vaccine, the mutation in the gene is a
deletion or a frameshift mutation or any other mutation which is
substantially incapable of reverting. Such gene-specific mutations can be
made using standard procedures such as introducing into the
microorganism a copy of the mutant gene on an autonomous replicon
(such as a plasmid or viral genome) and relying on homologous
recombination to introduce the mutation into the copy of the gene in the
genome of the microorganism.



Fifth and sixth aspects of the invention provide a suitable microorganism 
for use in a vaccine and a vaccine comprising a suitable microorganism
and a pharmaceutically-acceptable carrier. 

The suitable microorganism is the aforementioned avirulent mutant. 

Active immunisation of the patient is preferred. In this approach, one or
more mutant microorganisms are prepared in an immunogenic formulation 
containing suitable adjuvants and carriers and administered to the patient 
in known ways. Suitable adjuvants include Freund's complete or 
incomplete adjuvant, muramyl dipeptide, the "Iscoms" of EP 109 942, EP
180 564 and EP 231 039, aluminium hydroxide, saponin, DEAE-dextran,
neutral oils (such as miglyol), vegetable oils (such as arachis oil), 
liposomes, Pluronic polyols or the Ribi adjuvant system (see, for example 
GB-A-2 189 141). "Pluronic" is a Registered Trade Mark. The patient 
to be immunised is a patient requiring to be protected from the disease
caused by the virulent form of the microorganism.

The aforementioned avirulent microorganisms of the invention or a 
formulation thereof may be administered by any conventional method 
including oral and parenteral (e.g. subcutaneous or intramuscular) injection.
The treatment may consist of a single dose or a plurality of doses over a
period of time. 




Whilst it is possible for an avirulent microorganism of the invention to be 
administered alone, it is preferable to present it as a pharmaceutical 
formulation, together with one or more acceptable carriers. The carrier(s)
must be "acceptable" in the sense of being compatible with the avirulent
microorganism of the invention and not deleterious to the recipients 
thereof. Typically, the carriers will be water or saline which will be 
sterile and pyrogen free. 

It will be appreciated that the vaccine of the invention, depending on its 
microorganism component, may be useful in the fields of human medicine 
and veterinary medicine. 

Diseases caused by microorganisms are known in many animals, such as
domestic animals. The vaccines of the invention, when containing an 
appropriate avirulent microorganism, particularly avirulent bacterium, are 
useful in man but also in, for example, cows, sheep, pigs, horses, dogs 
and cats, and in poultry such as chickens, turkeys, ducks and geese. 

Seventh and eighth aspects of the invention provide a gene obtained by the 
method of the second aspect of the invention, and a polypeptide encoded 
thereby. 

By "gene" we include not only the regions of DNA that code for a
polypeptide but also regulatory regions of DNA such as regions of DNA 
that regulate transcription, translation and, for some microorganisms, 
splicing of RNA. Thus, the gene includes promoters, transcription 
terminators, ribosome-binding sequences and for some organisms introns
and splice recognition sites.

Typically, sequence information of the inactivated gene obtained in step
(7) is derived. Conveniently, sequences close to the ends of the transposon
are used as the hybridisation site of a sequencing primer. The derived
sequence or a DNA restriction fragment adjacent to the inactivated gene
itself is used to make a hybridisation probe with which to identify, and
isolate from a wild-type organism, the corresponding wild-type gene.



It is preferred if the hybridisation probing is done under stringent
conditions to ensure that the gene, and not a relative, is obtained. By
"stringent" we mean that the gene hybridises to the probe when the gene
is immobilised on a membrane and the probe (which, in this case, is >200
nucleotides in length) is in solution and the immobilised gene/hybridised
probe is washed in 0.1 x SSC at 65°C for 10 min. SSC is 0.15 M
NaCl/0.015 M Na citrate.
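
As a worked example of the wash buffer arithmetic, the salt concentrations of the 0.1 x SSC wash follow directly from the 1 x SSC composition given above. This is a minimal sketch; the names are illustrative.

```python
# Composition of 1 x SSC in mol/L, as given in the text.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def dilute_ssc(factor: float) -> dict:
    """Return the molar composition of a 'factor x SSC' solution."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {salt: conc * factor for salt, conc in SSC_1X.items()}


if __name__ == "__main__":
    wash = dilute_ssc(0.1)  # the stringent wash: 0.1 x SSC
    for salt, conc in wash.items():
        print(f"{salt}: {conc * 1000:.1f} mM")
```

The stringent wash therefore contains 15 mM NaCl and 1.5 mM Na citrate.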

Preferred probe sequences for cloning Salmonella virulence genes are 
shown in Figures 5 and 6 and described in Example 2. 

In a particularly preferred embodiment the Salmonella virulence genes
comprise the sequence shown in Figures 5 and 6 and described in 
Example 2. 

In further preference the genes are those contained within, or at least part
of which is contained within, the sequences shown in Figures 11 and 12
and which have been identified by the method of the second aspect of the
invention. The sequences shown in Figures 11 and 12 are part of a gene
cluster from Salmonella typhimurium which I have called virulence gene
cluster 2 (VGC2). The positions of transposon insertions are indicated
within the sequence, and these transposon insertions inactivate a virulence
determinant of the organism. As is discussed more fully below, and in
particular in Example 4, when the method of the second aspect of the
invention is used to identify virulence genes in Salmonella typhimurium,
many of the nucleic acid insertions (and therefore genes identified) are
clustered in a relatively small part of the genome. This region, VGC2,
contains other virulence genes which, as described below, form part of the
invention.

The gene isolated by the method of the invention can be expressed in a
suitable host cell. Thus, the gene (DNA) may be used in accordance with
known techniques, appropriately modified in view of the teachings
contained herein, to construct an expression vector, which is then used to
transform an appropriate host cell for the expression and production of the
polypeptide of the invention. Such techniques include those disclosed in
US Patent Nos. 4,440,859 issued 3 April 1984 to Rutter et al, 4,530,901
issued 23 July 1985 to Weissman, 4,582,800 issued 15 April 1986 to
Crowl, 4,677,063 issued 30 June 1987 to Mark et al, 4,678,751 issued 7
July 1987 to Goeddel, 4,704,362 issued 3 November 1987 to Itakura et al,
4,710,463 issued 1 December 1987 to Murray, 4,757,006 issued 12 July
1988 to Toole, Jr, et al, 4,766,075 issued 23 August 1988 to Goeddel et
al and 4,810,648 issued 7 March 1989 to Stalker, all of which are
incorporated herein by reference.

The DNA encoding the polypeptide constituting the compound of the 
invention may be joined to a wide variety of other DNA sequences for
introduction into an appropriate host. The companion DNA will depend 
upon the nature of the host, the manner of the introduction of the DNA 
into the host, and whether episomal maintenance or integration is desired. 

Generally, the DNA is inserted into an expression vector, such as a
plasmid, in proper orientation and correct reading frame for expression.
If necessary, the DNA may be linked to the appropriate transcriptional and
translational regulatory control nucleotide sequences recognised by the
desired host, although such controls are generally available in the
expression vector. The vector is then introduced into the host through
standard techniques. Generally, not all of the hosts will be transformed
by the vector. Therefore, it will be necessary to select for transformed
host cells. One selection technique involves incorporating into the
expression vector a DNA sequence, with any necessary control elements,
that codes for a selectable trait in the transformed cell, such as antibiotic
resistance. Alternatively, the gene for such selectable trait can be on
another vector, which is used to co-transform the desired host cell.

Host cells that have been transformed by the recombinant DNA of the 
invention are then cultured for a sufficient time and under appropriate
conditions known to those skilled in the art in view of the teachings 
disclosed herein to permit the expression of the polypeptide, which can 
then be recovered. 

Many expression systems are known, including bacteria (for example E.
coli and Bacillus subtilis), yeasts (for example Saccharomyces cerevisiae),
filamentous fungi (for example Aspergillus), plant cells, animal cells and
insect cells. 

The vectors include a prokaryotic replicon, such as the ColE1 ori, for
propagation in a prokaryote, even if the vector is to be used for expression
in other, non-prokaryotic, cell types. The vectors can also include an
appropriate promoter such as a prokaryotic promoter capable of directing
the expression (transcription and translation) of the genes in a bacterial
host cell, such as E. coli, transformed therewith.

A promoter is an expression control element formed by a DNA sequence 
that permits binding of RNA polymerase and transcription to occur. 
Promoter sequences compatible with exemplary bacterial hosts are 
typically provided in plasmid vectors containing convenient restriction sites
for insertion of a DNA segment of the present invention.

Typical prokaryotic vector plasmids are pUC18, pUC19, pBR322 and 
pBR329 available from Biorad Laboratories, (Richmond, CA, USA) and 
pTrc99A and pKK223-3 available from Pharmacia, Piscataway, NJ, USA.

A typical mammalian cell vector plasmid is pSVL available from 
Pharmacia, Piscataway, NJ, USA. This vector uses the SV40 late 
promoter to drive expression of cloned genes, the highest level of 
expression being found in T antigen-producing cells, such as COS-1 cells.

An example of an inducible mammalian expression vector is pMSG, also 
available from Pharmacia. This vector uses the glucocorticoid-inducible 
promoter of the mouse mammary tumour virus long terminal repeat to 
drive expression of the cloned gene.

Useful yeast plasmid vectors are pRS403-406 and pRS413-416 and are 
generally available from Stratagene Cloning Systems, La Jolla, CA 92037, 
USA. Plasmids pRS403, pRS404, pRS405 and pRS406 are Yeast 
Integrating plasmids (YIps) and incorporate the yeast selectable markers
HIS3, TRP1, LEU2 and URA3. Plasmids pRS413-416 are Yeast
Centromere plasmids (YCps).

A variety of methods have been developed to operably link DNA to 
vectors via complementary cohesive termini. For instance,
complementary homopolymer tracts can be added to the DNA segment to
be inserted to the vector DNA. The vector and DNA segment are then
joined by hydrogen bonding between the complementary homopolymeric 
tails to form recombinant DNA molecules. 

Synthetic linkers containing one or more restriction sites provide an
alternative method of joining the DNA segment to vectors. The DNA
segment, generated by endonuclease restriction digestion as described
earlier, is treated with bacteriophage T4 DNA polymerase or E. coli DNA
polymerase I, enzymes that remove protruding, 3'-single-stranded termini
with their 3'-5'-exonucleolytic activities, and fill in recessed 3'-ends with
their polymerizing activities.

The combination of these activities therefore generates blunt-ended DNA 
segments. The blunt-ended segments are then incubated with a large 
molar excess of linker molecules in the presence of an enzyme that is able 
to catalyze the ligation of blunt-ended DNA molecules, such as
bacteriophage T4 DNA ligase. Thus, the products of the reaction are 
DNA segments carrying polymeric linker sequences at their ends. These 
DNA segments are then cleaved with the appropriate restriction enzyme 
and ligated to an expression vector that has been cleaved with an enzyme 
that produces termini compatible with those of the DNA segment. 

Synthetic linkers containing a variety of restriction endonuclease sites are 
commercially available from a number of sources including International 
Biotechnologies Inc, New Haven, CT, USA.

A desirable way to modify the DNA encoding the polypeptide of the 
invention is to use the polymerase chain reaction as disclosed by Saiki et 
al (1988) Science 239, 487-491. 

In this method the DNA to be enzymatically amplified is flanked by two 
specific oligonucleotide primers which themselves become incorporated 
into the amplified DNA. The said specific primers may contain restriction 
endonuclease recognition sites which can be used for cloning into
expression vectors using methods known in the art.



Variants of the genes also form part of the invention. It is preferred if the 
variant has at least 70% sequence identity, more preferably at least 85% 
sequence identity, most preferably at least 95% sequence identity with the
genes isolated by the method of the invention. Of course, replacements, 
deletions and insertions may be tolerated. The degree of similarity 
between one nucleic acid sequence and another can be determined using 
the GAP program of the University of Wisconsin Computer Group. 
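
To illustrate how the identity thresholds above might be applied, the following sketch computes percent identity over two sequences that have already been aligned (gaps written as "-"). It is not the GAP program and does not perform the alignment itself; the choice of denominator (all aligned columns other than gap-gap columns) is an assumption, since identity conventions differ between programs. The sequences and function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    ASSUMPTION: the denominator is the number of alignment columns,
    excluding columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # ignore columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    if identity >= 95:
        return "most preferred (at least 95%)"
    if identity >= 85:
        return "more preferred (at least 85%)"
    if identity >= 70:
        return "preferred (at least 70%)"
    return "below the preferred range"


if __name__ == "__main__":
    # Hypothetical toy alignment with one mismatch in ten positions.
    identity = percent_identity("ATGGCCATTG", "ATGGCTATTG")
    print(f"{identity:.1f}% identity: {preference_tier(identity)}")
```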

Similarly, variants of proteins encoded by the genes are included. 

By "variants" we include insertions, deletions and substitutions, either 
conservative or non-conservative, where such changes do not substantially 
alter the normal function of the protein.

By "conservative substitutions" is intended combinations such as Gly, Ala; 
Val, Ile, Leu; Asp, Glu; Asn, Gln; Ser, Thr; Lys, Arg; and Phe, Tyr.
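
The groupings above can be expressed as a simple lookup, reading the residues as the standard three-letter amino acid codes. This is an illustrative sketch only; the function name is hypothetical.

```python
# Conservative substitution groups as listed in the text.
CONSERVATIVE_GROUPS = [
    {"Gly", "Ala"},
    {"Val", "Ile", "Leu"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Ser", "Thr"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` by `replacement` stays within a group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative("Val", "Leu"))  # same group
    print(is_conservative("Asp", "Lys"))  # different groups
```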

Such variants may be made using the well known methods of protein
engineering and site-directed mutagenesis. 

A ninth aspect of the invention provides a method of identifying a 
compound which reduces the ability of a microorganism to adapt to a 
particular environment comprising the steps of selecting a compound
which interferes with the function of (1) a gene obtained by the method of 
the second aspect of the invention or of (2) a polypeptide encoded by such 
a gene. 

Pairwise screens for compounds which affect the wild type cell but not a
cell overproducing a gene isolated by the methods of the invention form
part of this aspect of the invention.

For example, in one embodiment one cell is a wild type cell and a second 
cell is the Salmonella which is made to overexpress the gene isolated by 
the method of the invention. The viability and/or growth of each cell in 
the particular environment is determined in the presence of a compound 
to be tested to identify which compound reduces the viability or growth 
of the wild type cell but not the cell overexpressing the said gene. 

It is preferred if the gene is a virulence gene. 

For example, in one embodiment the microorganism (such as S.
typhimurium) is made to over-express the virulence gene identified by the
method of the first aspect of the invention. Each of (a) the "over-expressing"
microorganism and (b) an equivalent microorganism (which
does not over-express the virulence gene) are used to infect cells in
culture. Whether a particular test compound will selectively inhibit the
virulence gene function is determined by assessing the amount of the test
compound which is required to prevent infection of the host cells by (a)
the over-expressing microorganism and (b) the equivalent microorganism
(at least for some virulence gene products it is envisaged that the test
compound will inactivate them, and itself be inactivated, by binding to the
virulence gene product). If more of the compound is required to prevent
infection by (a) than by (b) then this suggests that the compound is
selective. It is preferred if the microorganisms (such as Salmonella) are
destroyed extracellularly by a mild antibiotic such as gentamicin (which
does not penetrate host cells) and that the effect of the test compound in
preventing infection of the cell by the microorganism is determined by
lysing the said cell and determining how many microorganisms are present
(for example by plating on agar).

Pairwise screens and other screens for compounds are generally disclosed 
in Kirsch & Di Domenico (1993) in "The Discovery of Natural Products
with a Therapeutic Potential" (Ed. V.P. Gallo), Chapter 6, pages 177-221,
Butterworths, U.K. (incorporated herein by reference).

Pairwise screens can be designed in a number of related formats in which 
the relative sensitivity to a compound is compared using two genetically
related strains. If the strains differ at a single locus, then a compound
specific for that target can be identified by comparing each strain's
sensitivity to the inhibitor. For example, inhibitors specific to the target
will be more active against a super-sensitive test strain when compared to
an otherwise isogenic sister strain. In an agar diffusion format, this is
determined by measuring the size of the zone of inhibition surrounding the
disc or well carrying the compound. Because of diffusion, a continuous
concentration gradient of compound is set up, and the strain's sensitivity
to inhibitors is proportional to the distance from the disc or well to the
edge of the zone. General antimicrobials, or antimicrobials with modes
of action other than the desired one are generally observed as having
similar activities against the two strains.
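
The comparison of zone sizes between a super-sensitive test strain and its isogenic sister strain can be sketched as follows. The 2 mm decision threshold, the category labels and the function name are hypothetical choices made for illustration; they are not values taken from the text or from the cited reference.

```python
def classify_compound(zone_test_mm: float, zone_sister_mm: float,
                      min_difference_mm: float = 2.0) -> str:
    """Classify a compound from two agar diffusion zone sizes.

    zone_test_mm   -- zone of inhibition on the super-sensitive strain
    zone_sister_mm -- zone of inhibition on the isogenic sister strain
    min_difference_mm -- HYPOTHETICAL threshold for a meaningful difference
    """
    if zone_test_mm < 0 or zone_sister_mm < 0:
        raise ValueError("zone sizes cannot be negative")
    if zone_test_mm == 0 and zone_sister_mm == 0:
        return "inactive against both strains"
    difference = zone_test_mm - zone_sister_mm
    if difference >= min_difference_mm:
        # Larger zone on the super-sensitive strain: candidate specific hit.
        return "candidate target-specific inhibitor"
    if difference <= -min_difference_mm:
        return "more active against the sister strain"
    # Similar zones: general antimicrobial or a different mode of action.
    return "similar activity against both strains"


if __name__ == "__main__":
    print(classify_compound(18.0, 11.0))
    print(classify_compound(14.0, 13.5))
```

A compound giving similar zones on both strains would, as noted above, be treated as a general antimicrobial or one with a different mode of action.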

Another type of molecular genetic screen involves pairs of strains where
a cloned gene product is overexpressed in one strain compared to a control
strain. The rationale behind this type of assay is that the strain containing
an elevated quantity of the target protein should be more resistant to
inhibitors specific to the cloned gene product than an isogenic strain
containing normal amounts of the target protein. In an agar diffusion
assay, the zone size surrounding a specific compound is expected to be
smaller in the strain overexpressing the target protein compared to an
otherwise isogenic strain.

Additionally or alternatively selection of a compound is achieved in the
following steps:

1. A mutant microorganism obtained using the method of the first
aspect of the invention is used as a control (it has a given phenotype, for
example, avirulence).

2. A compound to be tested is mixed with the wild-type
microorganism. 

3. The wild-type microorganism is introduced into the environment 
(with or without the test compound). 

4. If the wild-type microorganism is unable to adapt to the 
environment (following treatment by, or in the presence of, the 
compound), the compound is one which reduces the ability of the 
microorganism to adapt to, or survive in, the particular environment. 

When the environment is an animal body and the microorganism is a 
pathogenic microorganism, the compound identified by this method can be 
used in a medicament to prevent or ameliorate infection with the 
microorganism. 

A tenth aspect of the invention therefore provides a compound identifiable 
by the method of the ninth aspect. 

It will be appreciated that uses of the compound of the tenth aspect are 
related to the method by which it can be identified, and in particular in
relation to the host of a pathogenic microorganism. For example, if the
compound is identifiable by a method which uses a virulence gene, or
polypeptide encoded thereby, from a bacterium which infects a mammal,
the compound may be useful in treating infection of a mammal by that
bacterium.

Similarly, if the compound is identifiable by a method which uses a 
virulence gene, or polypeptide encoded thereby, from a fungus which 
infects a plant, the compound may be useful in treating infection of a plant 
by that fungus.

An eleventh aspect of the invention provides a molecule which selectively 
interacts with, and substantially inhibits the function of, a gene of the 
seventh aspect of the invention or a nucleic acid product thereof. 

By "nucleic acid product thereof" we include any RNA, especially
mRNA, transcribed from the gene.

Preferably a molecule which selectively interacts with, and substantially 
inhibits the function of, said gene or said nucleic acid product is an
antisense nucleic acid or nucleic acid derivative. 

More preferably, said molecule is an antisense oligonucleotide. 

Antisense oligonucleotides are single-stranded nucleic acids which can
specifically bind to a complementary nucleic acid sequence. By binding 
to the appropriate target sequence, an RNA-RNA, a DNA-DNA, or RNA- 
DNA duplex is formed. These nucleic acids are often termed "antisense" 
because they are complementary to the sense or coding strand of the gene. 
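
Since an antisense sequence is the reverse complement of the sense strand, a candidate antisense DNA oligonucleotide can be derived from a stretch of coding sequence as sketched below. The target sequence is a made-up example and the function name is hypothetical; practical design would also consider length, secondary structure and specificity, which this sketch does not address.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense: str) -> str:
    """Return the antisense (reverse complement) DNA, written 5' to 3'."""
    seq = sense.upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical 12-nucleotide stretch of a sense (coding) strand.
    target = "ATGACCGTTAAG"
    print(antisense_dna(target))  # CTTAACGGTCAT
```

Applying the function twice returns the original sense sequence, which is a convenient check on the result.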

Recently, formation of a triple helix has proven possible where the
oligonucleotide is bound to a DNA duplex. It was found that
oligonucleotides could recognise sequences in the major groove of the 
DNA double helix. A triple helix was formed thereby. This suggests that 
it is possible to synthesise sequence-specific molecules which specifically 
bind double-stranded DNA via recognition of major groove hydrogen 
binding sites. 

Clearly, the sequence of the antisense nucleic acid or oligonucleotide can 
readily be determined by reference to the nucleotide sequence of the gene 
in question. For example, antisense nucleic acid or oligonucleotides can 
be designed which are complementary to a part of the sequence shown in 
Figures 11 or 12, especially to sequences which form a part of a virulence
gene. 

Oligonucleotides are subject to being degraded or inactivated by cellular
endogenous nucleases. To counter this problem, it is possible to use
modified oligonucleotides, e.g. having altered internucleotide linkages, in
which the naturally occurring phosphodiester linkages have been replaced
with another linkage. For example, Agrawal et al (1988) Proc. Natl.
Acad. Sci. USA 85, 7079-7083 showed increased inhibition in tissue
culture of HIV-1 using oligonucleotide phosphoramidates and
phosphorothioates. Sarin et al (1988) Proc. Natl. Acad. Sci. USA 85,
7448-7451 demonstrated increased inhibition of HIV-1 using
oligonucleotide methylphosphonates. Agrawal et al (1989) Proc. Natl.
Acad. Sci. USA 86, 7790-7794 showed inhibition of HIV-1 replication in
both early-infected and chronically infected cell cultures, using nucleotide
sequence-specific oligonucleotide phosphorothioates. Leiter et al (1990)
Proc. Natl. Acad. Sci. USA 87, 3430-3434 report inhibition in tissue
culture of influenza virus replication by oligonucleotide phosphorothioates.

Oligonucleotides having artificial linkages have been shown to be resistant
to degradation in vivo. For example, Shaw et al (1991) in Nucleic Acids
Res. 19, 747-750, report that otherwise unmodified oligonucleotides
become more resistant to nucleases in vivo when they are blocked at the
3' end by certain capping structures and that uncapped oligonucleotide
phosphorothioates are not degraded in vivo.

A detailed description of the H-phosphonate approach to synthesizing
oligonucleoside phosphorothioates is provided in Agrawal and Tang (1990)
Tetrahedron Letters 31, 7541-7544, the teachings of which are hereby
incorporated herein by reference. Syntheses of oligonucleoside
methylphosphonates, phosphorodithioates, phosphoramidates, phosphate
esters, bridged phosphoramidates and bridged phosphorothioates are known
in the art. See, for example, Agrawal and Goodchild (1987) Tetrahedron
Letters 28, 3539; Nielsen et al (1988) Tetrahedron Letters 29, 2911; Jager
et al (1988) Biochemistry 27, 7237; Uznanski et al (1987) Tetrahedron
Letters 28, 3401; Bannwarth (1988) Helv. Chim. Acta. 71, 1517;
Cosstick and Vyle (1989) Tetrahedron Letters 30, 4693; Agrawal et al
(1990) Proc. Natl. Acad. Sci. USA 87, 1401-1405, the teachings of which
are incorporated herein by reference. Other methods for synthesis or
production also are possible. In a preferred embodiment the
oligonucleotide is a deoxyribonucleic acid (DNA), although ribonucleic
acid (RNA) sequences may also be synthesized and applied.

25 The oligonucleotides useful in the invention preferably are designed to 
resist degradation by endogenous nucleolytic enzymes. In -vivo 
degradation of oligonucleotides produces oligonucleotide breakdown 
products of reduced length. Such breakdown products are more likely to 
engage in non-specific hybridization and are less likely to be effective, 

30 relative to their full-length counterparts. Thus, it is desirable to use 
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oligonucleotides that are resistant to degradation in the body and which are 
able to reach the targeted cells. The present oligonucleotides can be 
rendered more resistant to degradation in vivo by substituting one or more 
internal artificial internucleotide linkages for the native phosphodiester 
5 linkages, for example, by replacing phosphate with sulphur in the linkage. 
Examples of linkages that may be used include phosphorothioates, 
methylphosphonates, sulphone, sulphate, ketyl, phosphorodithioates, 
various phosphoramidates, phosphate esters, bridged phosphorothioates 
and bridged phosphoramidates. Such examples are illustrative, rather than 

10 limiting, since other internucleotide linkages are known in the art. See, 
for example, Cohen, (1990) Trends in Biotechnology. The synthesis of 
oligonucleotides having one or more of these linkages substituted for the 
phosphodiester internucleotide linkages is well known in the art, including 
synthetic pathways for producing oligonucleotides having mixed
internucleotide linkages.

Oligonucleotides can be made resistant to extension by endogenous 
enzymes by "capping" or incorporating similar groups on the 5' or 3' 
terminal nucleotides. A reagent for capping is commercially available as 
Amino-Link II™ from Applied BioSystems Inc, Foster City, CA.
Methods for capping are described, for example, by Shaw et al (1991)
Nucleic Acids Res. 19, 747-750 and Agrawal et al (1991) Proc. Natl.
Acad. Sci. USA 88(17), 7595-7599, the teachings of which are hereby
incorporated herein by reference. 

A further method of making oligonucleotides resistant to nuclease attack 
is for them to be "self-stabilized" as described by Tang et al (1993) Nucl.
Acids Res. 21, 2729-2735, incorporated herein by reference.
Self-stabilized oligonucleotides have hairpin loop structures at their 3' ends,
and show increased resistance to degradation by snake venom
phosphodiesterase, DNA polymerase I and fetal bovine serum. The
self-stabilized region of the oligonucleotide does not interfere in hybridization
with complementary nucleic acids, and pharmacokinetic and stability
studies in mice have shown increased in vivo persistence of self-stabilized
oligonucleotides with respect to their linear counterparts.

In accordance with the invention, the inherent binding specificity of 
antisense oligonucleotides characteristic of base pairing is enhanced by 
limiting the availability of the antisense compound to its intended locus in
vivo, permitting lower dosages to be used and minimizing systemic effects.
Thus, oligonucleotides are applied locally to achieve the desired effect. 
The concentration of the oligonucleotides at the desired locus is much 
higher than if the oligonucleotides were administered systemically, and the 
therapeutic effect can be achieved using a significantly lower total amount. 

The local high concentration of oligonucleotides enhances penetration of
the targeted cells and effectively blocks translation of the target nucleic 
acid sequences. 

The oligonucleotides can be delivered to the locus by any means 
appropriate for localized administration of a drug. For example, a
solution of the oligonucleotides can be injected directly to the site or can 
be delivered by infusion using an infusion pump. The oligonucleotides 
also can be incorporated into an implantable device which when placed at 
the desired site, permits the oligonucleotides to be released into the 
surrounding locus.

The oligonucleotides are most preferably administered via a hydrogel 
material. The hydrogel is noninflammatory and biodegradable. Many 
such materials now are known, including those made from natural and 
synthetic polymers. In a preferred embodiment, the method exploits a
hydrogel which is liquid below body temperature but gels to form a
shape-retaining semisolid hydrogel at or near body temperature. Preferred
hydrogels are polymers of ethylene oxide-propylene oxide repeating units.
The properties of the polymer are dependent on the molecular weight of
the polymer and the relative percentage of polyethylene oxide and
polypropylene oxide in the polymer. Preferred hydrogels contain from 
about 10 to about 80% by weight ethylene oxide and from about 20 to 
about 90% by weight propylene oxide. A particularly preferred hydrogel 
contains about 70% polyethylene oxide and 30% polypropylene oxide. 

Hydrogels which can be used are available, for example, from BASF
Corp., Parsippany, NJ, under the tradename Pluronic®.

In this embodiment, the hydrogel is cooled to a liquid state and the 
oligonucleotides are admixed into the liquid to a concentration of about 1 
mg oligonucleotide per gram of hydrogel. The resulting mixture then is 
applied onto the surface to be treated, for example by spraying or painting 
during surgery or using a catheter or endoscopic procedures. As the 
polymer warms, it solidifies to form a gel, and the oligonucleotides diffuse 
out of the gel into the surrounding cells over a period of time defined by 
the exact composition of the gel. 

The oligonucleotides can be administered by means of other implants that 
are commercially available or described in the scientific literature, 
including liposomes, microcapsules and implantable devices. For 
example, implants made of biodegradable materials such as
polyanhydrides, polyorthoesters, polylactic acid and polyglycolic acid and 
copolymers thereof, collagen, and protein polymers, or non-biodegradable 
materials such as ethylenevinyl acetate (EVAc), polyvinyl acetate, ethylene 
vinyl alcohol, and derivatives thereof can be used to locally deliver the 
oligonucleotides. The oligonucleotides can be incorporated into the
material as it is polymerized or solidified, using melt or solvent
evaporation techniques, or mechanically mixed with the material. In one
embodiment, the oligonucleotides are mixed into or applied onto coatings
for implantable devices such as dextran coated silica beads, stents, or
catheters.

The dose of oligonucleotides is dependent on the size of the 
oligonucleotides and the purpose for which it is administered. In general,
the range is calculated based on the surface area of tissue to be treated.
The effective dose of oligonucleotide is somewhat dependent on the length
and chemical composition of the oligonucleotide but is generally in the
range of about 30 to 3000 µg per square centimetre of tissue surface area.

The oligonucleotides may be administered to the patient systemically for 
both therapeutic and prophylactic purposes. The oligonucleotides may be
administered by any effective method, for example, parenterally (eg 
intravenously, subcutaneously, intramuscularly) or by oral, nasal or other 
means which permit the oligonucleotides to access and circulate in the 
patient's bloodstream. Oligonucleotides administered systemically 
preferably are given in addition to locally administered oligonucleotides,
but also have utility in the absence of local administration. A dosage in 
the range of from about 0.1 to about 10 grams per administration to an 
adult human generally will be effective for this purpose. 

It will be appreciated that the molecules of this aspect of the invention are
useful in treating or preventing any infection caused by the microorganism 
from which the said gene has been isolated, or a close relative of said 
microorganism. Thus, the said molecule is an antibiotic. 

Thus, a twelfth aspect of the invention provides a molecule of the eleventh
aspect of the invention for use in medicine.

A thirteenth aspect of the invention provides a method of treating a host 
which has, or is susceptible to, an infection with a microorganism, the 
method comprising administering an effective amount of a molecule
according to the eleventh aspect of the invention wherein said gene is
present in said microorganism, or a close relative of said microorganism.

By "effective amount" we mean an amount which substantially prevents 
or ameliorates the infection. By "host" we include any animal or plant
which may be infected by a microorganism. 

It will be appreciated that pharmaceutical formulations of the molecule of 
the eleventh aspect of the invention form part of the invention. Such 
pharmaceutical formulations comprise the said molecule together with one
or more acceptable carriers. The carrier(s) must be "acceptable" in the 
sense of being compatible with the said molecule of the invention and not 
deleterious to the recipients thereof. Typically, the carriers will be water 
or saline which will be sterile and pyrogen free. 

As mentioned above, and as described in more detail in Example 4 below, 
I have found that certain virulence genes are clustered in Salmonella 
typhimurium in a region of the chromosome that I have called VGC2. 
DNA-DNA hybridisation experiments have determined that sequences
homologous to at least part of VGC2 are found in many species and
strains of Salmonella but are not present in the E. coli and Shigella strains
tested (see Example 4). These sequences almost certainly correspond to
conserved genes, at least in Salmonella, at least some of which are
virulence genes. It is believed that equivalent genes in other Salmonella
species and, if present, equivalent genes in other enteric or other bacteria
will also be virulence genes.



Whether a gene within the VGC2 region is a virulence gene is readily 
determined. For example, those genes within VGC2 which have been
identified by the method of the second aspect of the invention (when
applied to Salmonella typhimurium and wherein the environment is an
animal such as a mouse) are virulence genes. Virulence genes are also
identified by making a mutation in the gene (preferably a non-polar
mutation) and determining whether the mutant strain is avirulent.
Methods of making mutations in a selected gene are well known and are
described below.

A fourteenth aspect of the invention provides the VGC2 DNA of 
Salmonella typhimurium or a part thereof, or a variant of said DNA or a 
variant of a part thereof.

The VGC2 DNA of Salmonella typhimurium is depicted diagrammatically 
in Figure 8 and is readily obtainable from Salmonella typhimurium ATCC
14028 (available from the American Type Culture Collection, 12301
Parklawn Drive, Rockville, Maryland 20852, USA; also deposited at the
NCTC, Public Health Laboratory Service, Colindale, UK under accession
no. NCTC 12021) using the information provided in Example 4. For
example, probes derived from the sequences shown in Figures 11 and 12
may be used to identify λ clones from a Salmonella typhimurium genomic
library. Standard genome walking methods can be employed to obtain all
of the VGC2 DNA. The restriction map shown in Figure 8 can be used
to identify and locate DNA fragments from VGC2.

By "part of the VGC2 DNA of Salmonella typhimurium" we mean any 
DNA sequence which comprises at least 10 nucleotides, preferably at least
20 nucleotides, more preferably at least 50 nucleotides, still more
preferably at least 100 nucleotides, and most preferably at least 500
nucleotides of VGC2. A particularly preferred part of the VGC2 DNA
is the sequence shown in Figure 11, or a part thereof. Another
particularly preferred part of the VGC2 DNA is the sequence shown in
Figure 12, or a part thereof. 

Advantageously, the part of the VGC2 DNA is a gene, or part thereof. 

Genes can be identified within the VGC2 region by statistical analysis of
the open reading frames using computer programs known in the art. If an
open reading frame is greater than about 100 codons it is likely to be a
gene (although genes smaller than this are known). Whether an open
reading frame corresponds to the polypeptide coding region of a gene can
be determined experimentally. For example, a part of the DNA
corresponding to the open reading frame may be used as a probe in a
northern (RNA) blot to determine whether mRNA is expressed which
hybridises to the said DNA; alternatively or additionally a mutation may
be introduced into the open reading frame and the effect of the mutation
on the phenotype of the microorganism can be determined. If the
phenotype is changed then the open reading frame corresponds to a gene. 
Methods of identifying genes within a DNA sequence are known in the 
art. 
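
By way of illustration only, the open reading frame screen described above can be sketched in Python. The function name find_orfs, the use of ATG as the only start codon, and the default threshold of 100 codons are assumptions made for this sketch. It is not one of the computer programs referred to in the text and it performs no statistical analysis of codon usage.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int, int]]:
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, frame, start, codons) tuples, where start is the
    0-based position of the ATG on the scanned strand and codons counts
    the ATG but not the stop codon. The 100-codon default mirrors the
    rule of thumb in the text; it is an assumption, not a fixed rule.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    codons = (i - start) // 3
                    if codons >= min_codons:
                        hits.append((strand, frame, start, codons))
                    start = None  # look for the next ORF in this frame
    return hits
```

A candidate returned by such a screen would still need the experimental confirmation described above (northern blot or mutational analysis).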

By "variant of said DNA or a variant of a part thereof" we include any
variant as defined by the term "variant" in the seventh aspect of the 
invention. 

Thus, variants of VGC2 DNA of Salmonella typhimurium include 
equivalent genes, or parts thereof, from other Salmonella species, such as
Salmonella typhi and Salmonella enterica, as well as equivalent genes, or
parts thereof, from other bacteria such as other enteric bacteria. 

By "equivalent gene" we include genes which are functionally equivalent 
and those in which a mutation leads to a similar phenotype (such as
avirulence). It will be appreciated that before the present invention VGC2 
or the genes contained therein had not been identified and certainly not 
implicated in virulence determination. 

Thus, further aspects of the invention provide a mutant bacterium wherein
if the bacterium normally contains a gene that is the same as or equivalent
to a gene in VGC2, said gene is mutated or absent in said mutant
bacterium; methods of making a mutant bacterium wherein if the
bacterium normally contains a gene that is the same as or equivalent to a
gene in VGC2, said gene is mutated or absent in said mutant bacterium.
The following is a preferred method to inactivate a VGC2 gene. One first
subclones the gene on a DNA fragment from a Salmonella λ DNA library
or other DNA library using a fragment of VGC2 as a probe in
hybridisation experiments, maps the gene with respect to restriction
enzyme sites and characterises the gene by DNA sequencing in Escherichia
coli. Using restriction enzymes, one then introduces into the coding
region of the gene a segment of DNA encoding resistance to an antibiotic
(for example, kanamycin), possibly after deleting a portion of the coding
region of the cloned gene by restriction enzymes. Methods and DNA
constructs containing an antibiotic resistance marker are available to
ensure that the inactivation of the gene of interest is preferably non-polar,
that is to say, does not affect the expression of genes downstream from the
gene of interest. The mutant version of the gene is then transferred from
E. coli to Salmonella typhimurium using phage P22 transduction and
transductants checked by Southern hybridisation for homologous
recombination of the mutant gene into the chromosome.

This approach is commonly used in Salmonella (and can be used in S.
typhi), and further details can be found in many papers, including Galan
et al (1992) 174, 4338-4349.

Still further aspects provide a use of said mutant bacterium in a
vaccine; pharmaceutical compositions comprising said bacterium and a
pharmaceutically acceptable carrier; a polypeptide encoded by VGC2
DNA of Salmonella typhimurium or a part thereof, or a variant of a part
thereof; a method of identifying a compound which reduces the ability of
a bacterium to infect or cause disease in a host; a compound identifiable
by said method; a molecule which selectively interacts with, and
substantially inhibits the function of, a gene in VGC2 or a nucleic acid product
thereof; and medical uses and pharmaceutical compositions thereof.

The VGC2 DNA contains genes which have been identified by the 
methods of the first and second aspects of the invention as well as genes 
which have been identified by their location (although identifiable by the
methods of the first and second aspects of the invention). These further
aspects of the invention relate closely to the fourth, fifth, sixth, seventh,
eighth, ninth, tenth, eleventh, twelfth and thirteenth aspects of the
invention and, accordingly, the information given in relation to those
aspects, and preferences expressed in relation to those aspects, applies to
these further aspects.

It is preferred if the gene is from VGC2 or is an equivalent gene from 
another species of Salmonella such as S. typhi. It is preferred if the 
mutant bacterium is an S. typhimurium mutant or a mutant of another
species of Salmonella such as S. typhi.

It is believed that at least some of the genes in VGC2 confer the ability for
the bacterium, such as S. typhimurium, to enter cells.

The invention will now be described with reference to the following 
Examples and Figures wherein:

Figure 1 illustrates diagrammatically one particularly preferred method of 
the invention. 

Figure 2 shows a Southern hybridisation analysis of DNA from 12 S.
typhimurium exconjugants following digestion with EcoRV. The filter was 
probed with the kanamycin resistance gene of the mini-Tn5 transposon. 

Figure 3 shows a colony blot hybridisation analysis of DNA from 48 S. 
typhimurium exconjugants from a half of a microtitre dish (A1-H6). The
filter was hybridised with a probe comprising labelled amplified tags from 
DNA isolated from a pool of the first 24 colonies (A1-D6). 

Figure 4 shows a DNA colony blot hybridisation analysis of 95 S.
typhimurium exconjugants of a microtitre dish (A1-H11), which were
injected into a mouse. Replicate filters were hybridised with labelled
amplified tags from the pool (inoculum pattern), or with labelled amplified
tags from DNA isolated from over 10,000 pooled colonies that were
recovered from the spleen of the infected animal (spleen pattern).
Colonies B6, A11 and C8 gave rise to weak hybridisation signals on both
sets of filters. Hybridisation signals from colonies A3, C5, G3 (aroA),
and F10 are present on the inoculum pattern but not on the spleen pattern.
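
The comparison of inoculum and spleen patterns amounts to a simple rule, sketched below in Python. The numerical signal values, the 0.5 threshold and the control well D4 are hypothetical and are not read from the filters; only the other well names are taken from the description of Figure 4.

```python
from typing import Dict, List, Tuple


def classify_wells(
    inoculum: Dict[str, float],
    spleen: Dict[str, float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split wells into candidate attenuated mutants and uninformative wells.

    A well is a candidate when its tag gives a signal in the inoculum
    pattern but not in the spleen pattern. A well whose inoculum signal
    is already weak cannot be scored and is reported as uninformative.
    """
    candidates, uninformative = [], []
    for well in sorted(inoculum):
        if inoculum[well] < threshold:
            uninformative.append(well)
        elif spleen.get(well, 0.0) < threshold:
            candidates.append(well)
    return candidates, uninformative


# Hypothetical normalised intensities (0 = no signal, 1 = strong signal).
inoculum = {"A3": 0.9, "C5": 0.8, "G3": 0.9, "F10": 0.7,
            "B6": 0.2, "A11": 0.1, "C8": 0.2, "D4": 0.9}
spleen = {"A3": 0.0, "C5": 0.1, "G3": 0.0, "F10": 0.1,
          "B6": 0.1, "A11": 0.1, "C8": 0.2, "D4": 0.8}

candidates, uninformative = classify_wells(inoculum, spleen)
```

With these illustrative values the candidates are A3, C5, F10 and G3, and the uninformative wells are A11, B6 and C8, matching the outcome reported for Figure 4.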

Figure 5 shows the sequence of a Salmonella gene isolated using the
method of the invention and a comparison to the Escherichia coli clp
protease genome.

Figure 6 shows partial sequences of further Salmonella genes isolated using
the method of the invention (SEQ ID Nos. 8 to 36). 

Figure 7 shows the mapping of VGC2 on the S. typhimurium 
chromosome. (A) DNA probes from three regions of VGC2 were used 
in Southern hybridisation analysis of lysates from a set of S. typhimurium
strains harbouring locked-in Mud-P22 prophages. Lysates which
hybridised to a 7.5 kb PstI fragment (probe A in Figure 8) are shown.
The other two probes used hybridised to the same lysates. (B) The
insertion points and packaging directions of the phage are shown along
with the map position in minutes (edition VIII, ref 22 in Example 4). The
phage designations correspond to the following strains: 18P, TT15242;
18Q, 15241; 19P, TT15244; 19Q, TT15243; 20P, TT15246 and 20Q,
TT15245 (Ref in Example 4). The locations of mapped genes are shown
by horizontal bars and the approximate locations of other genes are 
indicated. 

Figure 8 shows a physical and genetic map of VGC2. (A) The positions
of 16 transposon insertions are shown above the line. The extent of
VGC2 is indicated by the thicker line. The position and direction of
transcription of ORFs described in the text of Example 4 are shown by
arrows below the line, together with the names of similar genes, with the
exception of ORFs 12 and 13 whose products are similar to the sensor and
regulatory components respectively, of a variety of two component
regulatory systems. (B) The location of overlapping clones and an
EcoRI/XbaI restriction fragment from Mud-P22 prophage strain TT15244
are shown as filled bars. Only the portions of the λ clones which have
been mapped are shown and the clones may extend beyond these limits.
(C) The positions of restriction sites are marked: B, BamHI; E, EcoRI;
V, EcoRV; H, HindIII; P, PstI and X, XbaI. The positions of the 7.5 kb
PstI fragment (probe A) used as a probe in Figure 7 and that of the 2.2
kb PstI/HindIII fragment (probe B) used as a probe in Figure 10 are
shown below the restriction map. The positions of Sequence 1 (described
in Figure 11) and Sequence 2 (described in Figure 12) are shown by the
thin arrows (labelled Sequence 1 and Sequence 2).



Figure 9 describes mapping the boundaries of VGC2. (A) The positions
of mapped genes at minutes 37 to 38 on the E. coli K12 chromosome are
aligned with the corresponding region of the S. typhimurium LT2
chromosome (minutes 30 to 31). An expanded map of the VGC2 region
is shown with 11 S. typhimurium (S. t.) DNA fragments used as probes
(thick bars) and the restriction sites used to generate them: B, BamHI; C,
ClaI; H, HindIII; K, KpnI; P, PstI; N, NsiI and S, SalI. Probes that
hybridised to E. coli K12 (E. c.) genomic DNA are indicated by +; those
which failed to hybridise are indicated by -.

Figure 10 shows that VGC2 is conserved among and specific to the
Salmonellae. Genomic DNA from Salmonella serovars and other
pathogenic bacteria was restricted with PstI (A), HindIII or EcoRV (B)
and subjected to Southern hybridisation analysis, using a 2.2 kb
PstI/HindIII fragment from λ clone 7 as a probe (probe B, Figure 2). The
filters were hybridised and washed under stringent (A) or non-stringent
(B) conditions.



Figure 11 shows the DNA sequence of "Sequence 1" of VGC2 from the
centre to the left-hand end (see the arrow labelled Sequence 1 in Figure
2). The DNA is translated in all six reading frames and the start and stop
positions of putative genes, and the transposon insertion positions for
various mutants identified by STM are indicated (SEQ ID No 37).

As is conventional a * indicates a stop codon and standard nucleotide 
ambiguity codes are used where necessary. 

Figure 12 shows the DNA sequence of "Sequence 2" of VGC2 (cluster C)
(see the arrow labelled Sequence 2 in Figure 2). The DNA is translated
in all six reading frames and the start and stop positions of putative genes,
and the transposon insertion positions for various mutants identified by
STM are indicated (SEQ ID No 38).

As is conventional a * indicates a stop codon and standard nucleotide 
ambiguity codes are used where necessary. 

Figures 7 to 12 are most relevant to Example 4.

Example 1: Identification of virulence genes in Salmonella
typhimurium

Materials and Methods

Bacterial Strains and Plasmids 

Salmonella typhimurium strain 12023 (equivalent to American Type
Culture Collection (ATCC) strain 14028) was obtained from the National
Collection of Type Cultures (NCTC), Public Health Laboratory Service,
Colindale, London, UK. A spontaneous nalidixic acid resistant mutant of
this strain (12023 Nalʳ) was selected in our laboratory. Another derivative
of strain 12023, CL1509 (aroA::Tn10) was a gift from Fred Heffron.
Escherichia coli strains CC118 λpir (Δ[ara-leu], araD, ΔlacX74, galE,
galK, phoA20, thi-1, rpsE, rpoB, argE(Am), recA1, λpir phage lysogen)
and S17-1 λpir (Tpʳ, Smʳ, recA, thi, pro, hsdR⁻M⁺, RP4:2-Tc:Mu:KmTn7,
λpir) were gifts from Kenneth Timmis. E. coli DH5α was used for
propagating pUC18 (Gibco-BRL) and Bluescript (Stratagene) plasmids
containing S. typhimurium DNA. Plasmid pUTmini-Tn5Km2 (de Lorenzo
et al, 1990) was a gift from Kenneth Timmis.

Construction of semi-random sequence tags and ligations 

The oligonucleotide pool RT1 (5'-CTAGGTACCTACAACCTCAAGCTT-
[NK]20-AAGCTTGGTTAGAATGGGTACCATG-3') (SEQ ID No 1), and
primers P2 (5'-TACCTACAACCTCAAGCT-3') (SEQ ID No 2), P3
(5'-CATGGTACCCATTCTAAC-3') (SEQ ID No 3), P4
(5'-TACCCATTCTAACCAAGC-3') (SEQ ID No 4) and P5
(5'-CTAGGTACCTACAACCTC-3') (SEQ ID No 5) were synthesized on an
oligonucleotide synthesizer (Applied Biosystems, model 380B). Double
stranded DNA tags were prepared from RT1 in a 100 µl volume PCR
containing 1.5 mM MgCl₂, 50 mM KCl, and 10 mM Tris-Cl (pH 8.0) with 200
pg of RT1 as target; 250 µM each dATP, dCTP, dGTP, dTTP; 100 pM of
primers P3 and P5; and 2.5 U of Amplitaq (Perkin-Elmer Cetus). Thermal
cycling conditions were 30 cycles of 95°C for 30 s, 50°C for 45 s, and 72°C
for 10 s. The PCR product was gel purified (Sambrook et al, 1989), passed
through an elutipD column (available from Schleicher and Schuell) and digested
with KpnI prior to ligation into pUC18 or pUTmini-Tn5Km2. For ligations,
plasmids were digested with KpnI and dephosphorylated with calf intestinal
alkaline phosphatase (Gibco-BRL). Linearized plasmid molecules were
gel-purified (Sambrook et al, 1989) prior to ligation to remove any residual uncut
plasmid DNA from the digestion. Ligation reactions contained approximately
50 ng each of plasmid and double stranded tag DNA in a 25 µl volume with 1
unit T4 DNA ligase (Gibco-BRL) in a buffer supplied with the enzyme.
Ligations were carried out for 2 h at 24°C. To determine the proportion of
bacterial colonies arising from either self ligation of the plasmid DNA or uncut 
plasmid DNA, a control reaction was carried out in which the double stranded 
tag DNA was omitted from the ligation reaction. This yielded no ampicillin 
resistant bacterial colonies following transformation of E. coli CC118
(Sambrook et al, 1989), compared with 185 colonies arising from a ligation 
reaction containing the double stranded tag DNA. 
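
The structure of the tag pool can be illustrated with a short Python sketch. It assumes that the variable region of RT1 is 20 repeats of NK, where N is any base and K is G or T; that reading, and the function names, are assumptions of this sketch. The arm and primer sequences are those given for RT1, P2 and P4.

```python
import random

LEFT_ARM = "CTAGGTACCTACAACCTCAAGCTT"     # invariant 5' arm of RT1
RIGHT_ARM = "AAGCTTGGTTAGAATGGGTACCATG"   # invariant 3' arm of RT1
P2 = "TACCTACAACCTCAAGCT"
P4 = "TACCCATTCTAACCAAGC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def random_tag(repeats=20, rng=None):
    """Draw one member of the pool: arms flanking `repeats` NK dinucleotides."""
    rng = rng or random.Random()
    variable = "".join(rng.choice("ACGT") + rng.choice("GT") for _ in range(repeats))
    return LEFT_ARM + variable + RIGHT_ARM


def tag_complexity(repeats=20):
    """Number of distinct variable regions: 4 choices for N times 2 for K, per repeat."""
    return (4 * 2) ** repeats


def amplicon_length(template, forward, reverse):
    """Length of the product primed by `forward` and `reverse` on `template`."""
    start = template.index(forward)
    end = template.index(revcomp(reverse)) + len(reverse)
    return end - start
```

Under the stated assumption each tag is 89 nucleotides long, the pool contains up to 8^20 (about 1.2 x 10^18) different variable regions, and primers P2 and P4 span 79 bp, which agrees with the fragments of about 80 bp excised after the first probe-labelling PCR.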

Bacterial Transformation and Matings 

The products of several ligations between pUT mini-Tn5Km2 and the
double stranded tag DNA were used to transform E. coli CC118 (Sambrook
et al, 1989). A total of approximately 10,300 transformants were pooled
and plasmid DNA extracted from the pool was used to transform E. coli
S-17 λpir (de Lorenzo & Timmis, 1994). For mating experiments, a pool of
approximately 40,000 ampicillin resistant E. coli S-17 λpir transformants,
and S. typhimurium 12023 Nalʳ were cultured separately to an optical
density (OD₅₈₀) of 1.0. Aliquots of each culture (0.4 ml) were mixed in 5
ml 10 mM MgSO₄ and filtered through a Millipore membrane (0.45 µm
diameter). The filters were placed on the surface of agar containing M9
salts (de Lorenzo & Timmis, 1994) and incubated at 37°C for 16 h. The
bacteria were recovered by shaking the filters in liquid LB medium for 40
min at 37°C and exconjugants were selected by plating the suspension onto
LB medium containing 100 µg ml⁻¹ nalidixic acid (to select against the donor
strain) and 50 µg ml⁻¹ kanamycin (to select for the recipient strain). Each
exconjugant was checked by transferring nalidixic acid resistant (nalʳ),
kanamycin resistant (kanʳ) colonies to MacConkey Lactose indicator medium
(to distinguish between E. coli and S. typhimurium), and to LB medium
containing ampicillin. Approximately 90% of the nalʳ, kanʳ colonies were
sensitive to ampicillin, indicating that these resulted from authentic
transposition events (de Lorenzo & Timmis, 1994). Individual
ampicillin-sensitive exconjugants were stored in 96 well microtitre dishes containing
LB medium. For long term storage at -80°C, either 7% DMSO or 15%
glycerol was included in the medium.

Phenotypic characterisation of mutants 

Mutants were replica plated from microtitre dishes onto solid medium 
containing M9 salts and 0.4% glucose (Sambrook et al, 1989) to identify 
auxotrophs. Mutants with rough colony morphology were detected by low
magnification microscopy of colonies on agar plates. 

Colony Blots, DNA extractions, PCRs, DNA labelings and hybridisations 

For colony blot hybridizations, a 48-well metal replicator (Sigma) was used
to transfer exconjugants from microtitre dishes to Hybond N nylon filters
(Amersham, UK) that had been placed on the surface of LB agar containing
50 µg ml⁻¹ kanamycin. After overnight incubation at 37°C, the filters
supporting the bacterial colonies were removed and dried at room
temperature for 10 min. The bacteria were lysed with 0.4 N NaOH and the
filters washed with 0.5 N Tris-Cl pH 7.0 according to the filter
manufacturer's instructions. The bacterial DNA was fixed to the filters by
exposure to UV light from a Stratalinker (Stratagene). Hybridisations to
³²P-labelled probes were carried out under stringent conditions as previously
described (Holden et al, 1989). For DNA extractions, S. typhimurium
transposon mutant strains were grown in liquid LB medium in microtitre 
dishes or resuspended in LB medium following growth on solid media. 
Total DNA was prepared by the hexadecyltrimethylammoniumbromide 
(CTAB) method according to Ausubel et al (1987). Briefly, cells from 150
to 1000 volumes were precipitated by centrifugation and resuspended in
576 µl TE. To this was added 15 µl of 20% SDS and 3 µl of 20 mg ml⁻¹
proteinase K. After incubating at 37°C for 1 hour, 166 µl of 3 M NaCl
was added and mixed thoroughly, followed by 80 µl of 10% (w/v) CTAB
and 0.7 M NaCl. After thorough mixing, the solution was incubated at
65°C for 10 min. Following extraction with phenol and phenol-chloroform,
the DNA was precipitated by addition of isopropanol, washed with 70%
ethanol and resuspended in TE at a concentration of approximately 1 µg
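
The working concentrations implied by the volumes in the extraction above follow from simple dilution arithmetic, sketched here in Python. The sketch assumes that the volumes are in microlitres and are additive, and it ignores the volume of the cell pellet; the helper name final_concentration is illustrative.

```python
def final_concentration(stock, added_volume, total_volume):
    """C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock * added_volume / total_volume


# Lysis step: 576 TE + 15 of 20% SDS + 3 of 20 mg/ml proteinase K.
lysis_volume = 576 + 15 + 3

sds_percent = final_concentration(20, 15, lysis_volume)
proteinase_k_mg_per_ml = final_concentration(20, 3, lysis_volume)

# Salt step: 166 of 3 M NaCl added to the lysate.
nacl_molar = final_concentration(3, 166, lysis_volume + 166)
```

On these assumptions the lysate contains about 0.5% SDS and 0.1 mg/ml proteinase K, and the NaCl concentration after the salt addition is about 0.66 M.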

The DNA samples were subjected to two rounds of PCR to generate
labelled probes. The first PCR was performed in 100 µl reactions
containing 20 mM Tris-Cl pH 8.3; 50 mM KCl; 2 mM MgCl₂; 0.01%
Tween 80; 200 µM each dATP, dCTP, dGTP, dTTP; 2.5 units of Amplitaq
polymerase (Perkin-Elmer Cetus); 770 ng each primer P2 and P4; and 5 µg
target DNA. After an initial denaturation of 4 min at 95°C, thermal cycling
consisted of 20 cycles of 45 s at 50°C, 10 s at 72°C, and 30 s at 95°C.
PCR products were extracted with chloroform/isoamyl alcohol (24/1) and
precipitated with ethanol. DNA was resuspended in 10 µl TE and the PCR
products were purified by electrophoresis through a 1.6% Seaplaque (FMC
Bioproducts) gel in TAE buffer. Gel slices containing fragments of about
80 bp were excised and used for the second PCR. This reaction was
carried out in a 20 µl total volume, and contained 20 mM Tris-Cl pH 8.3;
50 mM KCl; 2 mM MgCl₂; 0.01% Tween 80; 50 µM each dATP, dTTP,
dGTP; 10 µl ³²P-dCTP (3000 Ci/mmol, Amersham); 150 ng each primer P2
and P4; approximately 10 ng of target DNA (1-2 µl of 1.6% Seaplaque
agarose containing the first round PCR product); 0.5 units of Amplitaq
polymerase. The reaction was overlaid with 20 µl mineral oil and thermal
cycling was performed as described above. Incorporation of the radioactive
label was quantitated by absorbance to Whatman DE81 paper (Sambrook et
al, 1989).

Infection Studies

Individual Salmonella exconjugants containing tagged transposons were 
grown in 2% tryptone, 1% yeast extract, 0.92% v/v glycerol, 0.5%
Na₂PO₄, 1% KNO₃ (TYGPN medium) (Ausubel et al, 1987) in microtitre
plates overnight at 37°C. A metal replicator was used to transfer a small 
volume of the overnight cultures to a fresh microtitre plate and the cultures 
were incubated at 37°C until the OD 5S0 (measured using a Titertek 
Multiscan microtitre plate reader) was approximately 0.2 in each well.
Cultures from individual wells were then pooled and the OD₅₅₀ determined
using a spectrophotometer. The culture was diluted in sterile saline to
approximately 5x10⁵ cfu ml⁻¹. Further dilutions were plated out onto
TYGPN containing nalidixic acid (100 µg ml⁻¹) and kanamycin (50 µg ml⁻¹)
to confirm the cfu present in the inoculum.

Groups of three female BALB/c mice (20-25 g) were injected
intraperitoneally with 0.2 ml of bacterial suspension containing
approximately 1x10⁵ cfu ml⁻¹. Mice were sacrificed three days
post-inoculation and their spleens were removed to recover bacteria. Half of
each spleen was homogenized in 1 ml of sterile saline in a microfuge tube.
Cellular debris was allowed to settle and 1 ml of saline containing cells still
in suspension was removed to a fresh tube and centrifuged for two minutes
in a microfuge. The supernatant was aspirated and the pellet resuspended
in 1 ml of sterile distilled water. A dilution series was made in sterile
distilled water and 100 µl of each dilution was plated onto TYGPN agar
containing nalidixic acid (100 µg ml⁻¹) and kanamycin (50 µg ml⁻¹). Bacteria
were recovered from plates containing between 1000 and 4000 colonies, and 
a total of over 10,000 colonies recovered from each spleen were pooled and 
used to prepare DNA for PCR generation of probes to screen colony blots. 

30 
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Virulence gene cloning and DNA sequencing 

Total DNA was isolated from S. typhimurium exconjugants and digested
separately with SstI, SalI, PstI and SphI. Digests were fractionated through
agarose gels, transferred to Hybond N+ membranes (Amersham) and
subjected to Southern hybridisation analysis using the kanamycin resistance
gene of pUT mini-Tn5Km2 as a probe. The probe was labelled with
digoxigenin (Boehringer-Mannheim) and chemiluminescence detection was
carried out according to the manufacturer's instructions. The hybridisation
and washing conditions were as described above. Restriction enzymes which
gave rise to hybridising fragments in the 3-5 kb range were used to digest
DNA for a preparative agarose gel, and DNA fragments corresponding to
the sizes of the hybridisation signals were excised from this, purified and
ligated into pUC18. Ligation reactions were used to transform E. coli
DH5α to kanamycin resistance. Plasmids from kanamycin-resistant
transformants were purified by passage through an Elutip-D column and
checked by restriction enzyme digestion. Plasmid inserts were partially
sequenced by the di-deoxy method (Sanger et al., 1977) using the -40 primer
and reverse sequencing primer (United States Biochemical Corporation) and
the primers P6 (5'-CCTAGGCGGCCAGATCTGAT-3') (SEQ ID No 6) and
P7 (5'-GCACTTGTGTATAAGAGTCAG-3') (SEQ ID No 7) which anneal
to the I and O termini of Tn5, respectively. Nucleotide sequences and
deduced amino acid sequences were assembled using the MacVector 3.5
software package run on a Macintosh SE/30 computer. Sequences were
compared with the EMBL and GenBank DNA databases using the
UNIX/SUN computer system at the Human Genome Mapping Project
Resource Centre, Harrow, UK.
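
As an illustration only, the two transposon-end primers quoted above can be checked programmatically. The following Python sketch uses helper names (clean, reverse_complement) that are hypothetical and not part of the application; it strips stray spacing from a sequence string and derives the strand to which each primer would anneal.

```python
# Hypothetical helpers for inspecting primers P6 and P7 as quoted in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "P6": "CCTAGGCGGCCAGATCTGAT",   # SEQ ID No 6
    "P7": "GCACTTGTGTATAAGAGTCAG",  # SEQ ID No 7
}

def clean(seq: str) -> str:
    """Remove any whitespace and upper-case a sequence string."""
    return "".join(seq.split()).upper()

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]

for name, seq in PRIMERS.items():
    print(name, len(seq), "nt; anneals to", reverse_complement(seq))
```

The clean step matters here because scanned copies of the sequences can contain spurious spaces inside the primer.
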
Results
Tag Design 

The structure of the DNA tags is shown in Figure 1a. Each tag consists of
a variable central region flanked by "arms" of invariant sequence. The
central region sequence ([NK]20) was designed to prevent the occurrence of
sites for the commonly used 6 bp-recognition restriction enzymes, but is
sufficiently variable to ensure that statistically, the same sequence should
only occur once in 2 x 10^17 molecules (DNA sequencing of 12 randomly
selected tags showed that none shared more than 50% identity over the
variable region). (N means any base (A, G, C or T) and K means G or T.)
The arms contain KpnI sites close to the ends to facilitate the initial cloning
step, and the HindIII sites bordering the variable region were used to release
radiolabelled variable regions from the arms prior to hybridisation analysis.
The arms were also designed such that primers P2 and P4 each contain only
one guanine residue. Therefore during a PCR using these primers, only one
cytosine will be incorporated into each newly synthesised arm, compared to
an average of ten in the unique sequence. When radiolabelled dCTP is
included in the PCR, an average of ten-fold more label will be present in
the unique sequence compared with each arm. This is intended to minimise
background hybridisation signals from the arms, after they have been
released from the unique sequences by digestion with HindIII. Double
stranded tags were ligated into the KpnI site of the mini-Tn5 transposon
Km2, carried on plasmid pUT (de Lorenzo & Timmis, 1994). Replication
of this plasmid is dependent on the R6K-specified π product of the pir gene.
It carries the oriT sequence of the RP4 plasmid, permitting transfer to a
variety of bacterial species (Miller & Mekalanos, 1988), and the tnp* gene
needed for transposition of the mini-Tn5 element. The tagged mini-Tn5
transposons were transferred to S. typhimurium by conjugation, and 288
exconjugants resulting from transposition events were stored in the wells of
microtitre dishes. Total DNA isolated from 12 of these was digested with
EcoRV, and subjected to Southern hybridisation analysis using the
kanamycin resistance gene of the mini-Tn5 transposon as a probe. In each
case, the exconjugant had arisen as a result of a single integration of the
transposon into a different site of the bacterial genome (Figure 2).
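
The reasoning behind the tag design can be sketched numerically. The Python fragment below is a minimal illustration, not the procedure used to synthesise the tags. It assumes a variable region of 20 NK units (REPEATS is an assumption, chosen because it reproduces the stated average of ten cytosines per newly synthesised unique sequence), and the function names are hypothetical.

```python
import random
from fractions import Fraction

REPEATS = 20  # assumed number of NK units in the variable region

def random_tag(repeats=REPEATS, rng=random):
    """Draw one variable region: N is any base, K is G or T."""
    return "".join(rng.choice("ACGT") + rng.choice("GT") for _ in range(repeats))

def tag_complexity(repeats=REPEATS):
    """Number of distinct variable regions possible in principle (4 x 2 per unit)."""
    return (4 * 2) ** repeats

def expected_cytosines(repeats=REPEATS):
    """Expected C count on each strand and their mean, for random NK units."""
    # Tag strand: N is C with probability 1/4; K is never C.
    top = Fraction(repeats, 4)
    # Complementary strand: C pairs with every G, which occurs at N with
    # probability 1/4 and at K with probability 1/2.
    bottom = Fraction(repeats, 4) + Fraction(repeats, 2)
    return top, bottom, (top + bottom) / 2

print(random_tag())
print(f"{tag_complexity():.2e} possible variable regions")
print(expected_cytosines())
```

Under this assumption the two strands carry 5 and 15 cytosines on average, giving the mean of ten that is contrasted with the single cytosine in each arm.
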

Specificity and sensitivity studies 

We next determined the efficiency and uniformity of amplification of the
DNA tags in PCRs involving pools of exconjugant DNAs as targets for the
reactions. In an attempt to minimise unequal amplification of tags in the
PCR, we determined the maximum quantity of DNA target that could be
used in a 100 µl reaction, and the minimum number of PCR cycles, that
resulted in products which could be visualised by ethidium bromide staining
of an agarose gel (5 µg DNA and 20 cycles, respectively).

S. typhimurium exconjugants which had reached stationary growth phase in
microtitre dishes were combined, and used to extract DNA. This was
subjected to a PCR using primers P2 and P4. PCR products of 80 bp were
gel-purified and used as targets for a second PCR, using the same primers
but with 32P-labelled dCTP. This resulted in over 60% of the radiolabelled
dCTP being incorporated into the PCR products. The radiolabelled
products were digested with HindIII and used to probe colony blotted DNA
from their corresponding microtitre dishes. Of the 1510 mutants tested in
this way, 358 failed to yield a clear signal on an autoradiogram following
an overnight exposure of the colony blot. There are three potential
explanations for this. Firstly, it is possible that a proportion of the
transposons did not carry tags. However, by comparing the transformation
frequencies resulting from ligation reactions involving the transposon in the
presence or absence of tags, it seems unlikely that untagged transposons
could account for more than approximately 0.5% of the total (see Materials
and Methods). More probable causes are that the variable sequence was
truncated in some of the tags, and/or that some of the sequences formed
secondary structures, both of which might have prevented amplification.
Mutants which failed to give clear signals were not included in further
studies. The specificity of the efficiently amplifiable tags was demonstrated
by generating a probe from 24 colonies of a microtitre dish, and using it to
probe a colony blot of 48 colonies, which included the 24 used to generate
the probe. The lack of any hybridisation signal from the 24 colonies not
used to generate the probe (Figure 3) shows that the hybridisation conditions
employed were sufficiently stringent to prevent cross-hybridisation among
labelled tags, and suggests that each exconjugant is not reiterated within a
microtitre dish.

There are further considerations in determining the maximum pool size that
can be used as an inoculum in animal experiments. As the quantity of
labelled tag for each transposon is inversely proportional to the complexity
of the tag pool, there is a limit to the pool size above which hybridisation
signals become too weak to be detected after overnight exposure of an
autoradiogram. More importantly, as the complexity of the pool increases,
so must the likelihood of failure of a virulent representative of the pool to
be present in sufficient numbers, in the spleen of an infected animal, to
produce enough labelled probe. We have not determined the upper limit for
pool size in the murine model of salmonellosis that we have employed, but
it must be in excess of 96.

Virulence tests of the transposon mutants 

A total of 1152 uniquely tagged insertion mutants (from two microtitre
dishes) were tested for virulence in BALB/c mice in twelve pools, each
representing a 96-well microtitre dish. Animals received an intraperitoneal
injection of approximately 10^3 cells of each of 96 transposon mutants of a
microtitre dish (10^5 organisms in total). Three days after injection mice
were sacrificed, and bacteria were recovered by plating spleen homogenates
onto laboratory medium. Approximately 10,000 colonies recovered from
each mouse were pooled and DNA was extracted. The tags present in this
DNA sample were amplified and labelled by the PCR, and colony blots
probed and compared with the hybridisation pattern obtained using tags
amplified from the inoculum (Figure 3). As a control, an aroA mutant of
S. typhimurium was tagged and employed as one of the 96 mutants in the
inoculum. This strain would not be expected to be recovered in the spleen
because its virulence is severely attenuated (Buchmeier et al., 1993).
Forty-one mutants were identified whose DNA hybridised to labelled tags from
the inoculum but not to labelled tags from bacteria recovered from the
spleen. The experiment was repeated and the same forty-one mutants were
again identified. Two of these were the aroA mutant (one per pool), as
expected. Another was an auxotrophic mutant (it failed to grow on minimal
medium). All of the mutants had normal colony morphology.
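
The comparison of the two hybridisation patterns amounts to a set difference, and the pool arithmetic can be checked in the same place. The Python sketch below is illustrative only: the function names are hypothetical and the wells marked as lost are invented example data, not results from the screen.

```python
# Sketch of the negative-selection comparison; all well data below are hypothetical.
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)

def well_labels():
    """Labels A1..H12 of a 96-well microtitre dish."""
    return [f"{row}{col}" for row in ROWS for col in COLUMNS]

def attenuated_candidates(inoculum_signals, recovered_signals):
    """Wells that hybridise to the inoculum probe but not to the recovered probe."""
    return sorted(set(inoculum_signals) - set(recovered_signals))

POOLS = 12
WELLS_PER_POOL = len(well_labels())                  # 96 mutants per pool
CELLS_PER_MUTANT = 10**3                             # approximate, from the text
TOTAL_MUTANTS = POOLS * WELLS_PER_POOL               # 1152
CELLS_PER_MOUSE = WELLS_PER_POOL * CELLS_PER_MUTANT  # 96,000, i.e. about 10^5

# Hypothetical example: every well is seen in the inoculum, two are lost in the spleen.
inoculum = well_labels()
recovered = [w for w in inoculum if w not in {"B7", "F10"}]
candidates = attenuated_candidates(inoculum, recovered)
print(candidates)
```

Wells that give no signal with the inoculum probe never enter the candidate list, which mirrors the exclusion of mutants whose tags did not amplify clearly.
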

Example 2: Cloning and partial characterisation of sequences flanking 
the transposon 

DNA was extracted from one of the mutants described in Example 1 (Pool 
1, F10), digested with SstI, and subcloned on the basis of kanamycin
resistance. The sequence of 450 bp flanking one end of the transposon was
determined using primer P7. This sequence shows 80% identity to the E.
coli clp (lon) gene, which encodes a heat-regulated protease (Figure 5). To
our knowledge, this gene has not previously been implicated as a virulence
determinant.

Partial sequences of thirteen further Salmonella typhimurium virulence genes
are shown in Figure 6 (sequences A2 to A9 and B1 to B5). Deduced amino
acid sequences of P2D6, S4C3, P3F4, P7G2 and P9B7 bear similarities to
a family of secretion-associated proteins that have been conserved
throughout bacterial pathogens of animals and plants, and which are known
in Salmonella as the inv family. In S. typhimurium the inv genes are
required for bacterial invasion into intestinal tissue. The virulence of inv
mutants is attenuated when they are inoculated by the oral route, but not
when they are administered intraperitoneally. The discovery of inv-related
genes that are required for virulence following intraperitoneal inoculation
suggests a new secretion apparatus which might be required for invasion of
non-phagocytic cells of the spleen and other organs. The products of these
new genes might represent better drug targets than the inv proteins in the
treatment of established infections.

Further characterisation of the genes identified in this example is described 
in Example 4. 

Example 3: LD50 determinations and mouse vaccination study

Mutations identified by the method of the invention attenuate virulence. 

Five of the mutations in genes not previously implicated in virulence were
transferred by P22-mediated transduction to the nalidixic acid-sensitive
parent strain of S. typhimurium 12023. Transductants were checked by
restriction mapping then injected by the intraperitoneal route into groups of
BALB/c mice to determine their 50% lethal dose (LD50). The LD50 values
for mutants S4C3, P7G2, P3F4 and P9B7 were all several orders of
magnitude higher than that of the wild-type strain. No difference in the
LD50 was detected for mutant P1F10; however, there was a statistically
significant decrease in the proportion of P1F10 cells recovered from the
spleens of mice injected with an inoculum consisting of an equal proportion
of this strain and the wild-type strain. This implies that this mutation does
attenuate virulence, but to a degree that is not detectable by LD50.

Mutants P3F4 and P9B7 were also administered by the oral route at an
inoculum level of 10^7 cells/mouse. None of the mice became ill, indicating
that the oral LD50 levels of these mutants are at least an order of magnitude
higher than that of the wild-type strain.

In the mouse vaccination study, groups of five female BALB/c mice of 20-25
g in mass were initially inoculated orally (p.o.) or intraperitoneally (i.p.)
with serial ten-fold dilutions of Salmonella typhimurium mutant strains P3F4
and P9B7. After four weeks the mice were then inoculated with 500 c.f.u.
of the parental wild type strain. Deaths were then recorded over four
weeks.

A group of two mice of the same age and batch as the mice inoculated with
the mutant strains were also inoculated i.p. with 500 c.f.u. of the wild type
strain as a positive control. Both non-immunised mice died as expected
within four weeks.

Results are tabulated below: 

1) p.o. initial inoculation with mutant strain P3F4

initial inoculum      no. mice surviving     no. mice surviving
in c.f.u.             first challenge        wild type challenge

5 x 10^9              5                      2 (40%)
5 x 10^8              5                      2 (40%)
5 x 10^7              5                      0 (0%)

2) i.p. initial inoculum with mutant strain P3F4

initial inoculum      no. mice surviving     no. mice surviving
in c.f.u.             first challenge        wild type challenge

5 x 10^6              3                      3 (100%)
5 x 10^5              5                      4 (80%)
5 x 10^4              6                      5 (83%)
5 x 10^3              5                      4 (80%)

3) p.o. initial inoculum with mutant strain P9B7

initial inoculum      no. mice surviving     no. mice surviving
in c.f.u.             first challenge        wild type challenge

5 x 10^9              5                      0 (0%)

4) i.p. initial inoculum with mutant P9B7

initial inoculum      no. mice surviving     no. mice surviving
in c.f.u.             first challenge        wild type challenge

5 x 10^6              4                      2 (50%)

From these experiments I conclude that mutant P3F4 appears to give some
protection against subsequent wild type challenge. This protection appears
greater in mice that were immunised i.p.
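
The percentages in the tables, and the comparison between routes, can be recomputed from the raw counts. The Python sketch below transcribes the counts from the four tables; pooling the dose groups for each route is an illustrative simplification introduced here, not an analysis described in the application, and the function names are hypothetical.

```python
# Counts transcribed from the four tables above: (initial inoculum in c.f.u.,
# mice surviving the first inoculation, mice surviving the wild type challenge).
VACCINATION = {
    ("P3F4", "p.o."): [(5e9, 5, 2), (5e8, 5, 2), (5e7, 5, 0)],
    ("P3F4", "i.p."): [(5e6, 3, 3), (5e5, 5, 4), (5e4, 6, 5), (5e3, 5, 4)],
    ("P9B7", "p.o."): [(5e9, 5, 0)],
    ("P9B7", "i.p."): [(5e6, 4, 2)],
}

def survival_percent(challenged, survivors):
    """Percentage of challenged mice that survived, rounded to a whole number."""
    return round(100 * survivors / challenged)

def pooled_survival(strain, route):
    """Fraction surviving the wild type challenge, pooled over all dose groups."""
    rows = VACCINATION[(strain, route)]
    challenged = sum(row[1] for row in rows)
    survivors = sum(row[2] for row in rows)
    return survivors / challenged

for key, rows in VACCINATION.items():
    print(key, [survival_percent(c, s) for _, c, s in rows], round(pooled_survival(*key), 2))
```

On these counts, the pooled survival for P3F4 is 4 of 15 mice after oral immunisation and 16 of 19 after intraperitoneal immunisation. The group sizes are small, so the figures are indicative only.
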



Example 4: Identification of a virulence locus encoding a second type 
III secretion system in Salmonella typhimurium 


Abbreviations used in this Example are VGC1, virulence gene cluster 1; 
VGC2, virulence gene cluster 2.

Background to the experiments described



Salmonella typhimurium is a principal agent of gastroenteritis in humans and
produces a systemic illness in mice which serves as a model for human
typhoid fever (1). Following oral inoculation of mice with S. typhimurium,
the bacteria pass from the lumen of the small intestine through the intestinal
mucosa, via enterocytes or M cells of the Peyer's patch follicles (2). The
bacteria then invade macrophages and neutrophils, enter the
reticuloendothelial system and disseminate to other organs, including the
spleen and liver, where further reproduction results in an overwhelming and
fatal bacteremia (3). To invade host cells, to survive and replicate in a
variety of physiologically stressful intracellular and extracellular
environments and to circumvent the specific antibacterial activities of the
immune system, S. typhimurium employs a sophisticated repertoire of
virulence factors (4).

To gain a more comprehensive understanding of virulence mechanisms of
S. typhimurium and other pathogens, we used the transposon mutagenesis
system described in Example 1, conveniently called 'signature-tagged
mutagenesis' (STM), which combines the strength of mutational analysis
with the ability to follow simultaneously the fate of a large number of
different mutants within a single animal (5 and Example 1; Reference 5 was
published after the priority date for this invention). Using this approach we
identified 43 mutants with attenuated virulence from a total of 1152 mutants
that were screened. The nucleotide sequences of DNA flanking the
insertion points of transposons in 5 of these mutants showed that they were
related to genes encoding type III secretion systems of a variety of bacterial
pathogens (6, 7). The products of the inv/spa gene cluster of S.
typhimurium (8, 9) are proteins that form a type III secretion system
required for the assembly of surface appendages mediating entry into
epithelial cells (10). Hence the virulence of strains carrying mutations in
the inv/spa cluster is attenuated only if the inoculum is administered orally
and not when given intraperitoneally (8). In contrast, the 5 mutants
identified by STM are avirulent following intraperitoneal inoculation (5).

In this example we show that the transposon insertion points of these 5
mutants and an additional 11 mutants identified by STM all map to the same
region of the S. typhimurium chromosome. Further analysis of this region
reveals additional genes whose deduced products have sequence similarity
to other components of type III secretion systems. This chromosomal
region, which we refer to as virulence gene cluster 2 (VGC2), is not present
in a number of other enteric bacteria, and represents an important locus for
S. typhimurium virulence.

Materials and Methods

Bacterial Strains, Transduction and Growth Media. Salmonella enterica
serotypes 5791 (aberdeen), 423180 (gallinarum), 7101 (cubana) and 12416
(typhimurium LT2) were obtained from the National Collections of Type
Cultures, Public Health Laboratory Service, UK. Salmonella typhi BRD123
genomic DNA was a gift from G. Dougan; enteropathogenic Escherichia
coli (EPEC), enterohemorrhagic E. coli (EHEC), Vibrio cholerae biotype El
Tor, Shigella flexneri serotype 2 and Staphylococcus aureus were clinical
isolates obtained from the Department of Infectious Diseases and
Bacteriology, Royal Postgraduate Medical School, UK. Genomic DNA
from Yersinia pestis was a gift from J. Heesemann. However, genomic
DNA can be isolated using standard methods. The bacterial strains and the
methods used to generate signature-tagged mini-Tn5 transposon mutants of
S. typhimurium NCTC strain 12023 have been described previously (5, 11).
Routine propagation of plasmids was in E. coli DH5α. Bacteria were
grown in LB broth (12) supplemented with the appropriate antibiotics.
Before virulence levels of individual mutant strains were assessed, the
mutations were first transferred by phage P22 mediated transduction (12) to
the nalidixic acid sensitive parental strain of S. typhimurium 12023.
Transductants were analysed by restriction digestion and Southern
hybridisation before use as inoculum.

Lambda Library Screening. Lambda (λ) clones with overlapping insert
DNAs covering VGC2 were obtained by standard methods (13) from a
λ1059 library (14) containing inserts from a partial Sau3A digest of S.
typhimurium LT2 genomic DNA. The library was obtained via K.
Sanderson, from the Salmonella Genetic Stock Centre (SGSC), Calgary,
Canada.

Mud-P22 Lysogens. Radiolabeled DNA probes were hybridised to
Hybond N (Amersham) filters bearing DNA prepared from lysates of a set
of S. typhimurium strains harbouring Mud-P22 prophages at known
positions in the S. typhimurium genome. Preparation of mitomycin-induced
Mud-P22 lysates was as described (12, 15). The set of Mud-P22 prophages
was originally assembled by Benson and Goldman (16) and was obtained
from the SGSC.

Gel Electrophoresis and Southern Hybridisation. Gel electrophoresis was
performed in 1% or 0.6% agarose gels run in 0.5 x TBE. Gel fractionated
DNA was transferred to Hybond N or N+ membranes (Amersham) and
stringent hybridisation and washing procedures (permitting hybridisation
between nucleotide sequences with 10% or less mismatches) were as
described by Holden et al. (17). For non-stringent conditions (permitting
hybridisation between sequences with 50% mismatches) filters were
hybridised overnight at 42°C in 10% formamide/0.25 M Na2HPO4/7% SDS
and the most stringent step was with 20 mM Na2HPO4/1% SDS at 42°C.
DNA fragments used as probes were labelled with [32P]dCTP using the
'Radprime' system (Gibco-BRL) or with [digoxigenin-11]dUTP and detected
using the Digoxigenin system (Boehringer Mannheim) according to the
manufacturers' instructions, except that hybridisation was performed in the
same solution as that used for radioactively labelled probes. Genomic DNA
was prepared for Southern hybridisation as described previously (13).

Molecular Cloning and Nucleotide Sequencing. Restriction endonucleases
and T4 DNA ligase were obtained from Gibco-BRL. General molecular
biology techniques were as described in Sambrook et al. (18). Nucleotide
sequencing was performed by the dideoxy chain termination method (19)
using a T7 sequencing kit (Pharmacia). Sequences were assembled with the
MacVector 3.5 software or AssemblyLIGN packages. Nucleotide and
derived amino acid sequences were compared with those in the European
Molecular Biology Laboratory (EMBL) and SwissProt databases using the
BLAST and FASTA programs of the GCG package from the University of
Wisconsin (version 8) (20) on the network service at the Human Genome
Mapping Project Resource Centre, Hinxton, UK.

Virulence Tests. Groups of five female BALB/c mice (20-25 g) were
inoculated orally (p.o.) or intraperitoneally (i.p.) with 10-fold dilutions of
bacteria suspended in physiological saline. For preparation of the inoculum,
bacteria were grown overnight at 37°C in LB broth with shaking (50 rpm)
and then used to inoculate fresh medium for various lengths of time until an
optical density (OD) at 560 nm of 0.4 to 0.6 had been reached. For cell
densities of 5 x 10^8 colony forming units (cfu) per ml and above, cultures
were concentrated by centrifugation and resuspended in saline. The
concentration of cfu/ml was checked by plating a dilution series of the
inoculum onto LB agar plates. Mice were inoculated i.p. with 0.2 ml
volumes and p.o. by gavage with the same volume of inoculum. The LD50
values were calculated after 28 days by the method of Reed and Muench
(21).
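
For readers unfamiliar with the Reed and Muench calculation, the following Python sketch shows one common formulation: deaths are accumulated from the lowest dose upward, survivors from the highest dose downward, and the 50% endpoint is interpolated on a log10 dose scale. The function name and the example data are hypothetical; this is not the code or the data used for the values reported here.

```python
import math

def reed_muench_ld50(doses, deaths, survivors):
    """Estimate the LD50 by Reed-Muench interpolation on log10(dose)."""
    order = sorted(range(len(doses)), key=lambda i: doses[i])
    d = [doses[i] for i in order]
    dead = [deaths[i] for i in order]
    alive = [survivors[i] for i in order]
    n = len(d)
    # An animal killed by a low dose is assumed to die at any higher dose,
    # and an animal surviving a high dose to survive any lower dose.
    cum_dead = [sum(dead[: i + 1]) for i in range(n)]
    cum_alive = [sum(alive[i:]) for i in range(n)]
    mortality = [cd / (cd + ca) for cd, ca in zip(cum_dead, cum_alive)]
    for i in range(n - 1):
        low, high = mortality[i], mortality[i + 1]
        if low < 0.5 <= high:
            distance = (0.5 - low) / (high - low)  # proportionate distance
            step = math.log10(d[i + 1]) - math.log10(d[i])
            return 10 ** (math.log10(d[i]) + distance * step)
    raise ValueError("50% mortality is not bracketed by the doses tested")

# Hypothetical data: four ten-fold dose groups of five mice each.
example = reed_muench_ld50(
    doses=[1e1, 1e2, 1e3, 1e4],
    deaths=[0, 1, 4, 5],
    survivors=[5, 4, 1, 0],
)
print(f"LD50 is about {example:.0f} cfu")
```

When no pair of adjacent doses brackets 50% mortality the function raises an error, which corresponds to reporting the LD50 only as a bound (for example "greater than" the highest dose tested).
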



Results

Localisation of Transposon Insertions. The generation of a bank of
Salmonella typhimurium mini-Tn5 transposon mutants and the screen used
to identify 43 mutants with attenuated virulence have been described
previously (5). Transposons and flanking DNA regions were cloned from
exconjugants by selection for kanamycin resistance or by inverse PCR.
Nucleotide sequences of 300-600 bp of DNA flanking the transposons were
obtained for 33 mutants. Comparison of these sequences with those in the
DNA and protein databases indicated that 14 mutants resulted from
transposon insertions into previously known virulence genes, 7 arose from
insertions into new genes with similarity to known genes of the
enterobacteria and 12 resulted from insertions into sequences without
similarity to entries in the DNA and protein databases (ref. 5, Example 1
and this Example).

Three lines of evidence suggested that 16 of 19 transposon insertions into
new sequences were clustered in three regions of the genome, initially
designated A, B and C. First, comparing nucleotide sequences from regions
flanking transposon insertion points with each other and with those in the
databases showed that some sequences overlapped with one another or had
strong similarity to different regions of the same gene. Second, Southern
analysis of genomic DNA digested with several restriction enzymes and
probed with restriction fragments flanking transposon insertion points
indicated that some transposon insertions were located on the same
restriction fragments. Third, when the same DNA probes were hybridised
to plaques from a S. typhimurium λ DNA library, the probes from mutants
which the previous two steps had suggested might be linked were found to
hybridise to the same λ DNA clones. Thus two mutants (P9B7 and P12F5)
were assigned to cluster A, five mutants (P2D6, P9B6, P11C3, P11D10 and
P11H10) to cluster B and nine mutants (P3F4, P4F8, P7A3, P7B8, P7G2,
P8G12, P9G4, P10E11 and P11B9) to cluster C (Figure 8).

Hybridisation of DNA probes from these three clusters to lysates from a set
of S. typhimurium strains harbouring locked-in Mud-P22 prophages (15, 16)
showed that the three loci were all located in the minute 30 to 31 region
(edition VIII, ref. 22) (Figure 7), indicating that the three loci were closely
linked or constituted one large virulence locus. To determine if any of the
λ clones covering clusters A, B and C contained overlapping DNA inserts,
DNA fragments from the terminal regions of each clone were used as
probes in Southern hybridisation analysis of the other λ clones. Hybridising
DNA fragments showed that several λ clones overlap and that clusters A,
B and C comprise one contiguous region (Figure 8). DNA fragments from
the ends of this region were then used to probe the λ library to identify
further clones containing inserts representing the adjacent regions. No λ
clones were identified that covered the extreme right hand terminus of the
locus so this region was obtained by cloning a 6.5 kb EcoRI/XbaI fragment
from a lysate of the Mud-P22 prophage strain TT15244 (16).

Restriction mapping and Southern hybridisation analysis were then used to
construct a physical map of this locus (Figure 8). To distinguish this locus
from the well characterised inv/spa gene cluster at minute 63 (edition VIII,
ref. 22) (8, 9, 23, 24, 25, 26), we refer to the latter as virulence gene
cluster 1 (VGC1) and have termed the new virulence locus VGC2. Figure
2 shows the position of two portions of DNA whose nucleotide sequence
has been determined ("Sequence 1" and "Sequence 2"). The nucleotide
sequence is shown in Figures 11 and 12.

Mapping the boundaries of VGC2 on the S. typhimurium chromosome. 

Nucleotide sequencing of λ clone 7 at the left hand side of VGC2 revealed
the presence of an open reading frame (ORF) whose deduced amino acid
sequence is over 90% identical to the derived product of a segment of the
ydhE gene of E. coli, and sequencing of the 6.5 kb EcoRI/XbaI cloned
fragment on the right hand side of VGC2 revealed the presence of an ORF
whose predicted amino acid sequence is over 90% identical to pyruvate
kinase I of E. coli encoded by the pykF gene (27). On the E. coli
chromosome ydhE and pykF are located close to one another, at minute 37
to 38 (28). Eleven non-overlapping DNA fragments distributed along the
length of VGC2 were used as probes in non-stringent Southern hybridisation
analysis of E. coli and S. typhimurium genomic DNA. Hybridising DNA
fragments showed that a region of approximately 40 kb comprising VGC2
was absent from the E. coli genome and localised the boundaries of VGC2
to within 1 kb (Figure 9). Comparison of the location of the XbaI site close
to the right hand end of VGC2 (Figure 8) with a map of known XbaI sites
(29) at the minute 30 region of the chromosome (22) enables a map position
of 30.7 minutes to be deduced for VGC2.

Structure of VGC2. Nucleotide sequencing of portions of VGC2 has 
revealed the presence of 19 ORFs (Figure 8). The G+C content of 
approximately 26 kb of nucleotide sequence within VGC2 is 44.6%, 
compared to 47% for VGC1 (9) and 51-53% estimated for the entire
Salmonella genome (30). 
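
The G+C content quoted above is simply the percentage of G and C residues in the sequence. A minimal Python sketch of that calculation follows; the function name is hypothetical, the example sequence is invented, and ambiguous symbols such as N are ignored by assumption.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    bases = [b for b in "".join(sequence.split()).upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T residues")
    gc = sum(1 for b in bases if b in "GC")
    return 100 * gc / len(bases)

# Invented example sequence, not taken from VGC2.
print(round(gc_content("ATGGCGTTAACCGGATAT"), 1))
```

A markedly lower G+C content than the genome average, as reported for VGC2, is the kind of difference this calculation is used to reveal when applied to a long sequence.
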

The complete deduced amino acid sequences of ORFs 1-11 are similar to
those of proteins of type III secretion systems (6, 7), which are known to
be required for the export of virulence determinants in a variety of bacterial
pathogens of plants and animals (7). The predicted proteins of ORFs 1-8
(Figure 8) are similar in organisation and sequence to the products of the
yscN-U genes of Yersinia pseudotuberculosis (31), to invC/spaS of the
inv/spa cluster in VGC1 of Salmonella typhimurium (8, 9) and to
spa47/spa40 of the spa/mxi cluster of Shigella flexneri (32, 33, 34, 35).
For example, the predicted amino acid sequence of ORF 3 (Figure 8) is 50%
identical to YscS of Y. pseudotuberculosis (31), 34% identical to Spa9 from
S. flexneri (35) and 37% identical to SpaQ of VGC1 of S. typhimurium (9).
The predicted protein product of ORF9 is closely related to the LcrD family
of proteins with 43% identity to LcrD of Y. enterocolitica (36), 39%
identity to MxiA of S. flexneri (32) and 40% identity to InvA of VGC1
(23). Partial nucleotide sequences for the remaining ORFs shown in Figure
8 indicate that the predicted protein from ORF10 is most similar to Y.
enterocolitica YscJ (37), a lipoprotein located in the bacterial outer
membrane, with ORF11 similar to S. typhimurium InvG, a member of the
PulD family of translocases (38). ORF12 and ORF13 show significant
similarity to the sensor and regulatory subunits respectively, from a variety
of proteins comprising two component regulatory systems (39). There is
ample coding capacity for further genes between ORFs 9 and 10, ORFs 10
and 11, and between ORF 19 and the right hand end of VGC2.
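
Percentage identity figures of the kind quoted above are derived from pairwise alignments. The Python sketch below shows one simple convention, in which identical residues are counted over the aligned columns that contain no gap. Other conventions use a different denominator, and the application does not state which was applied, so this is an assumption; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # columns containing a gap are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100 * matches / compared

# Toy alignment, not taken from any of the ORFs discussed.
print(percent_identity("MKT-LV", "MRTALV"))
```
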

VGC2 is conserved among and is specific to the Salmonellae. A 2.2 kb
PstI/HindIII fragment located at the centre of VGC2 (probe B, Figure 8)
lacking sequence similarity to entries in the DNA and protein databases was
used as a probe in Southern hybridisation analysis of genomic DNA from
Salmonella serovars and other pathogenic bacteria (Figure 10A). DNA
fragments hybridising under non-stringent conditions showed that VGC2 is
present in S. aberdeen, S. gallinarum, S. cubana, S. typhi and is absent
from EPEC, EHEC, Y. pestis, S. flexneri, V. cholerae and S. aureus. Thus
VGC2 is conserved among and is likely to be specific to the Salmonellae.
To determine if the organisation of the locus is conserved among the
Salmonella serovars tested, stringent Southern hybridisations with genomic
DNA digested with two further restriction enzymes were carried out.
Hybridising DNA fragments showed that there is some heterogeneity in the
arrangement of restriction sites between S. typhimurium LT2 and S.
gallinarum, S. cubana and S. typhi (Figure 10B). Furthermore, S.
gallinarum and S. typhi contain additional hybridising fragments to those
present in the other Salmonellae examined, suggesting that regions of VGC2
have been duplicated in these species.

VGC2 is required for virulence in mice. Previous experiments showed
that the LD50 values for i.p. inoculation of transposon mutants P3F4, P7G2,
P9B7 and P11C3 were at least 100-fold greater than the wild type strain (5).
In order to clarify the importance of VGC2 in the process of infection, the
p.o. and i.p. LD50 values for mutants P3F4 and P9B7 were determined
(Table 1). Both mutants showed a reduction in virulence of at least five
orders of magnitude by either route of inoculation in comparison with the
parental strain. This profound attenuation of virulence by both routes of
inoculation demonstrates that VGC2 is required for events in the infective
process after epithelial cell penetration in BALB/c mice.



Table 1. LD50 values of S. typhimurium strains.

                         LD50 (cfu)
Strain               i.p.            p.o.
12023 wild type      4.2             6.2 x 10^4
P3F4                 1.5 x 10^6      >5 x 10^9
P9B7                 >1.5 x 10^6     >5 x 10^9

cfu, colony forming units
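
By way of illustration only, the degree of attenuation implied by Table 1
can be expressed as the base-10 logarithm of the ratio of mutant to wild
type LD50. The sketch below assumes that entries reported as ">x" are
treated as the lower bound x, so every figure it prints is itself a lower
bound; the function name is illustrative and not part of the original
analysis. With these bounds the ratios come out at roughly five orders of
magnitude by both routes.

```python
import math

# LD50 values (cfu) transcribed from Table 1. Entries reported as ">x" are
# stored as the lower bound x, so the derived figures are lower bounds too.
LD50 = {
    "12023 wild type": {"i.p.": 4.2, "p.o.": 6.2e4},
    "P3F4": {"i.p.": 1.5e6, "p.o.": 5e9},
    "P9B7": {"i.p.": 1.5e6, "p.o.": 5e9},
}


def orders_of_attenuation(mutant, route, wild_type="12023 wild type"):
    """Return log10(LD50 of mutant / LD50 of wild type) for one route."""
    return math.log10(LD50[mutant][route] / LD50[wild_type][route])


for strain in ("P3F4", "P9B7"):
    for route in ("i.p.", "p.o."):
        value = orders_of_attenuation(strain, route)
        print(f"{strain} {route}: {value:.1f} orders of magnitude")
```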

Discussion

A hitherto unknown virulence locus in S. typhimurium of approximately 40
kb, located at minute 30.7 on the chromosome, has been identified by
mapping the insertion points of a group of signature-tagged transposon
mutants with attenuated virulence (5). This locus is referred to as
virulence gene cluster 2 (VGC2) to distinguish it from the inv/spa
virulence genes at 63 minutes (edition VIII, ref. 22) which we suggest be
renamed VGC1. VGC1 and VGC2 both encode components of type III secretion
systems. However, these secretion systems are functionally distinct.

Of 19 mutants that arose from insertions into new genes (ref. 5 and this
example) 16 mapped to the same region of the chromosome. It is possible
that mini-Tn5 insertion occurs preferentially in VGC2. Alternatively, as the
negative selection used to identify mutants with attenuated virulence (5) was
very stringent (reflected by the high LD50 values for VGC2 mutants) it is
possible that, among the previously unknown genes, only mutations in those
of VGC2 result in a degree of attenuation sufficient to be recovered in the
screen. The failure of previous searches for S. typhimurium virulence
determinants to identify VGC2 might stem from reliance on cell culture
assays rather than a live animal model of infection. A previous study which
identified regions of the S. typhimurium LT2 chromosome unique to
Salmonellae (40) located one such region (RF333) to minutes 30.5 - 32.
Therefore, RF333 may correspond to VGC2, although it was not known
that RF333 was involved in virulence determination.

Comparisons with the type III secretion systems encoded by the virulence
plasmids of Yersinia and Shigella as well as with VGC1 of Salmonella
indicate that VGC2 encodes the basic structural components of the
secretory apparatus. Furthermore, the order of ORFs 1-8 in VGC2 is the
same as the gene order in homologues in Yersinia, Shigella and VGC1 of
S. typhimurium. The fact that the organisation and structure of the VGC2
secretion system is no more closely related to VGC1 than to the
corresponding genes of Yersinia, together with the low G+C content of
VGC2, suggests that VGC2, like VGC1 (40, 41, 42), was acquired
independently by S. typhimurium via horizontal transmission. The proteins
encoded by ORFs 12 and 13 show strong similarity to bacterial two
component regulators (39) and could regulate ORFs 1-11 and/or the
secreted proteins of this system.

Many genes in VGC1 have been shown to be important for entry of S.
typhimurium into epithelial cells. This process requires bacterial contact (2)
and results in cytoskeletal rearrangements leading to localised membrane
ruffling (43, 44). The role of VGC1 and its restriction to this stage of the
infection is reflected in the approximately 50-fold attenuation of virulence
in BALB/c mice inoculated p.o. with VGC1 mutants and by the fact that
VGC1 mutants show no loss of virulence when administered i.p. (8). The
second observation also explains why no VGC1 mutants were obtained in
our screen (5). In contrast, mutants in VGC2 are profoundly attenuated
following both p.o. and i.p. inoculation. This shows that, unlike VGC1,
VGC2 is required for virulence in mice after epithelial cell penetration, but
these findings do not exclude a role for VGC1 in this early stage of
infection.

Thus, in summary, mapping the insertion points of 16 signature-tagged
transposon mutants on the Salmonella typhimurium chromosome led to the
identification of a 40 kb virulence gene cluster at minute 30.7. This locus
is conserved among all other Salmonella species examined, but not present
in a variety of other pathogenic bacteria or in Escherichia coli K12.
Nucleotide sequencing of a portion of this locus revealed 11 open reading
frames whose predicted products are components of a type III secretion
system. To distinguish between this and the type III secretion system
encoded by the inv/spa invasion locus we refer to the inv/spa locus as
virulence gene cluster 1 (VGC1) and the new locus as VGC2. VGC2 has
a lower G+C content than that of the Salmonella genome and is flanked by
genes whose products share greater than 90% identity with those of the E.
coli ydhE and pykF genes. Thus VGC2 was probably acquired horizontally
by insertion into a region corresponding to that between the ydhE and pykF
genes of E. coli. Virulence studies of VGC2 mutants have shown them to
be attenuated by at least five orders of magnitude compared with the wild
type strain following oral or intraperitoneal inoculation.
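
The G+C comparison referred to above can be made concrete with a short
calculation. The following is a minimal sketch, assuming the sequence is
available as a plain string; the function names, window size and step are
illustrative choices and are not taken from this specification. Ambiguous
bases such as N are ignored rather than counted.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = [base for base in seq if base in "ACGT"]  # skips N, spaces, etc.
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(base in "GC" for base in counted) / len(counted)


def windowed_gc(seq, window=1000, step=500):
    """Yield (start, G+C fraction) for sliding windows along seq.

    A sequence shorter than one window is reported as a single window.
    """
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        yield start, gc_content(seq[start:start + window])
```

A region whose windowed values sit consistently below the figure obtained
for the rest of the chromosome would be consistent with, though not proof
of, horizontal acquisition.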

Example 5: Identification of virulence genes in Streptococcus pneumoniae

(a) Mutagenesis 

In the absence of a convenient transposon system, the most efficient way of
creating tagged mutants of Streptococcus pneumoniae is to use
insertion-duplication mutagenesis (Morrison et al (1984) J. Bacteriol. 159,
870). Random S. pneumoniae DNA fragments of 200-400 bp will be
generated by genomic DNA digestion with a restriction enzyme or by
physical shearing by sonication followed by gel fractionation and DNA
end-repair using T4 DNA polymerase. The fragments are ligated into
plasmid pJDC9 (Pearce et al (1993) Mol. Microbiol. 9, 1037), which carries
the erm gene for erythromycin selection in E. coli and S. pneumoniae and
has previously been modified by incorporation of DNA sequence tags into
one of the polylinker cloning sites. The size of cloned S. pneumoniae DNA is
sufficient to ensure homologous recombination, and reduces the possibility
of generating an unrepresentative library in E. coli (expression of S.
pneumoniae proteins can be toxic to E. coli). Alternative vectors carrying
different selectable markers are available and can be used in place of
pJDC9. Tagged plasmids carrying DNA fragments are introduced into an
appropriate S. pneumoniae strain selected on the basis of serotype and
virulence in a murine model of pneumococcal pneumonia. Regulation of
competence for genetic transformation in S. pneumoniae is governed by
competence factor, a peptide of 17 amino acids which has been
characterized recently by Don Morrison's group at the University of Illinois
at Chicago and which is described in Havarstein, Coomaraswamy and
Morrison (1995) Proc. Natl. Acad. Sci. USA 92, 11140-11144.
Incorporation of minute quantities of this peptide in transformation
experiments leads to very efficient transformation frequencies in some
encapsulated clinical isolates of S. pneumoniae. This overcomes a major
hurdle in pneumococcal molecular genetics and the availability of the
peptide greatly facilitates the construction of S. pneumoniae mutant banks
and allows flexibility in choosing the strain(s) to be mutated. A
proportion of transformants are analysed to verify homologous integration
of the plasmid sequences, and checked for stability. The very low level of
reversion associated with mutants generated by insertion-duplication is
minimized by the fact that the duplicated regions will be short (200-400
bp); however if the level of reversion is unacceptably high, antibiotic
selection is maintained during growth of the transformants in culture and
during growth in the animal.

(b) Animal model 

The S. pneumoniae mutant bank is organized into pools for inoculation into
Swiss and/or C57BL/6 mice. Preliminary experiments are conducted to
determine the optimum complexity of the pools and the optimum inoculum
level. One attractive model utilises inocula of 10^5 cfu, delivered by mouth
to the trachea (Veber et al (1993) J. Antimicrobial Chemotherapy 32, 473).
Swiss mice develop acute pneumonia within 3-4 days, and C57BL/6 mice
develop subacute pneumonia within 8-10 days. These pulmonary models
of infection yield 10^8 cfu/lung (Veber et al (1993) J. Antimicrobial
Chemotherapy 32, 473) at the time of death. If required, mice are also
injected intraperitoneally for the identification of genes required for
bloodstream infection (Sullivan et al (1993) Antimicrobial Agents and
Chemotherapy 37, 234).

(c) Virulence gene identification 

Once the parameters of the infection model are optimized, a mutant bank
consisting of several thousand strains is subjected to virulence tests.
Mutants with attenuated virulence are identified by hybridisation analysis,
using labelled tags from the 'input' and 'recovered' pools as probes. If S.
pneumoniae DNA cannot be colony blotted easily, chromosomal DNA is
liberated chemically or enzymatically in the wells of microtitre dishes prior
to transfer onto nylon membranes using a dot-blot apparatus. DNA flanking
the integrated plasmid is cloned by plasmid rescue in E. coli (Morrison et
al (1984) J. Bacteriol. 159, 870), and sequenced. Genomic DNA libraries
are constructed in appropriate vectors maintained in either E. coli or a
Gram-positive host strain, and are probed with restriction fragments
flanking the integrated plasmid to isolate cloned virulence genes, which are
then fully sequenced and subjected to detailed functional analysis.
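
The comparison of 'input' and 'recovered' hybridisation signals described
above can be sketched as follows. This is an illustration under stated
assumptions only: signals are taken to be intensities in arbitrary units
keyed by microtitre well, and the thresholds `detect` and `max_ratio` are
hypothetical values chosen for the example, not figures from this
specification.

```python
def attenuated_candidates(input_signal, recovered_signal,
                          detect=1.0, max_ratio=0.1):
    """Return tags that hybridise with the input-pool probe but give little
    or no signal with the recovered-pool probe.

    input_signal and recovered_signal map a tag (or microtitre well) name
    to a hybridisation intensity in arbitrary units. `detect` and
    `max_ratio` are illustrative thresholds.
    """
    candidates = []
    for tag, before in input_signal.items():
        if before < detect:
            continue  # tag labelled poorly in the input pool; no conclusion
        after = recovered_signal.get(tag, 0.0)
        if after / before <= max_ratio:
            candidates.append(tag)
    return sorted(candidates)


# Made-up intensities for four wells of one pool.
input_pool = {"A1": 8.0, "A2": 7.5, "A3": 0.4, "A4": 9.1}
recovered_pool = {"A1": 7.2, "A2": 0.3, "A4": 0.0}
print(attenuated_candidates(input_pool, recovered_pool))  # ['A2', 'A4']
```

Well A3 is skipped rather than reported because its tag gave no usable
signal with the input probe, so its absence from the recovered pool is
uninformative.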


Example 6: Identification of virulence genes in Enterococcus faecalis

(a) Mutagenesis

Mutagenesis of E. faecalis is accomplished using plasmid pAT112 or a
derivative, developed for this purpose. pAT112 carries genes for selection
in both Gram-negative and Gram-positive bacteria, and the att site of
Tn1545. It therefore requires the presence in the host strain of the integrase
for transposition, and stable, single copy insertions are obtained if the host
does not contain an excisionase gene (Trieu-Cuot et al (1991) Gene 106,
21). Recovery of DNA flanking the integrated plasmid is accomplished by
restriction digestion of genomic DNA, intramolecular ligation and
transformation of E. coli. The presence of single sites for restriction
enzymes in pAT112 and its derivatives (Trieu-Cuot et al (1991) Gene
106, 21) allows the incorporation of DNA sequence tags prior to transfer
to a virulent strain of E. faecalis carrying plasmid pAT145 (to provide the
integrase function) by either conjugation, electroporation or transformation
(Trieu-Cuot et al (1991) Gene 106, 21; Wirth et al (1986) J. Bacteriol. 165,
831).

(b) Animal model 

A large number of insertion mutants are analysed for random integration of 
the plasmid by isolating DNA from transcipients, restriction enzyme 
digestion and Southern hybridisation. Individual mutants are stored in the 
wells of microtitre dishes, and complexity and size of pooled inocula are 
optimised prior to screening of the mutant bank. Two different models of 
infection caused by E. faecalis are employed. The first is a well established 
rat model of endocarditis, involving tail vein injection of up to 10^8 cfu of
E. faecalis into animals that have a catheter inserted across the aortic valve 
(Whitman et al (1993) Antimicrobial Agents and Chemotherapy 37, 1069). 
Animals are sacrificed at various times after inoculation, and bacterial 
vegetations on the aortic valve are excised, homogenized and plated to 
culture medium to recover bacterial colonies. Virulent bacteria are also 
recovered from the blood at various times after inoculation. The second 
model is of peritonitis in mice, following intraperitoneal injection of up to 
10^9 cfu of E. faecalis (Chenoweth et al (1990) Antimicrobial Agents and
Chemotherapy 34, 1800). As with the S. pneumoniae model, preliminary 
experiments are done to establish the optimum complexity of the pools and 
the optimum inoculum level, prior to screening the mutant 
bank. 

(c) Virulence gene identification

Isolation of DNA flanking the site of integration of pAT112 using its E. coli
origin of replication is simplified by the lack of sites for most of the
commonly used 6 bp recognition restriction enzymes in the vector.
Therefore DNA from the strains of interest is digested with one of these
enzymes, self-ligated, transformed into E. coli and sequenced using primers
based on the sequences adjacent to the att sites on the plasmid. A genomic
DNA library of E. faecalis is probed with sequences of interest to identify
intact copies of virulence genes which are then sequenced.
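
The choice of an enzyme that does not cut within the vector can be
illustrated by scanning a vector sequence for recognition sites. The
sketch below is an assumption-laden example: the four enzymes listed are
common 6 bp cutters chosen for illustration, the test fragment is invented
and is not pAT112 sequence, and the scan treats the sequence as linear, so
a site spanning the junction of a circular plasmid would be missed.

```python
# Recognition sequences of four widely used 6 bp cutters (all palindromic,
# so scanning one strand is sufficient).
SIX_CUTTERS = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
}


def site_positions(seq, site):
    """Return 0-based start positions of every (overlapping) match of site."""
    seq, site = seq.upper(), site.upper()
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def enzymes_not_cutting(vector_seq, enzymes=SIX_CUTTERS):
    """Return the names of enzymes with no recognition site in vector_seq."""
    return sorted(name for name, site in enzymes.items()
                  if not site_positions(vector_seq, site))


# Hypothetical vector fragment for illustration only.
toy_vector = "TTGACGGATCCATTAGCAAGCTTGGA"
print(enzymes_not_cutting(toy_vector))  # ['EcoRI', 'PstI']
```

Any enzyme returned by `enzymes_not_cutting` would leave the vector intact
on digestion of genomic DNA, so that self-ligation yields a plasmid
carrying the flanking host sequence.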


Example 7: Identification of virulence genes in Pseudomonas aeruginosa 

(a) Mutagenesis 

Since transposon Tn5 has been used by others to mutagenise Pseudomonas
aeruginosa, and the mini-Tn5 derivative that was used for the identification
of Salmonella typhimurium virulence genes (Example 1) is reported to have
broad utilisation among Gram-negative bacteria, including several
pseudomonads (DeLorenzo and Timmis (1994) Methods Enzymol. 264,
386), a P. aeruginosa mutant bank is constructed using our existing pool of
signature tagged mini-Tn5 transposons by conjugal transfer of the suicide
vector to one or more virulent (and possibly mucoid) recipient strains. This
approach represents a significant time saving. Other derivatives of Tn5
designed specifically for P. aeruginosa mutagenesis (Rella et al (1985) Gene
33, 293) may alternatively be employed with the mini Tn5 transposon.

(b) Animal model and virulence gene identification 

The bank of P. aeruginosa insertion mutants is screened for attenuated
virulence in a chronic pulmonary infection model in rats. Suspensions of
P. aeruginosa cells are introduced into a bronchus following tracheotomy,
and disease develops over a 30 day period (Woods et al (1982) Infect.
Immun. 36, 1223). Bacteria are recovered by plating lung homogenates to
laboratory medium and sequence tags from these are used to probe DNA
colony blots of bacteria used as the inoculum. It is also possible to subject
the mutant bank to virulence tests in a model of endogenous bacteremia
(Hirakata et al (1992) Antimicrobial Agents and Chemotherapy 36, 1198),
and cystic fibrosis (Davidson et al (1995) Nature Genetics 9, 351) in mice.
Cloning and sequencing of DNA flanking the transposons is done as
described in Example 1. Genomic DNA libraries for the isolation and
sequencing of intact copies of the genes are constructed in the laboratory by
standard methods.

Example 8: Identification of virulence genes in Aspergillus fumigatus 

(a) Mutagenesis

The functional equivalent of transposon mutagenesis in fungi is restriction
enzyme mediated integration (REMI) of transforming DNA (Schiestl and
Petes (1991) Proc. Natl. Acad. Sci. 88, 7585). In this process, fungal cells
are transformed with DNA fragments carrying a selectable marker in the
presence of a restriction enzyme, and single copy integrations occur at
different genomic sites, defined by the target sequence of the restriction
enzyme. REMI has already been used successfully to isolate virulence
genes of Cochliobolus (Lu et al (1994) Proc. Natl. Acad. Sci. USA 91,
12649) and Ustilago (Bolker et al (1995) Mol. Gen. Genet. 248, 547), and
we have shown that incorporation of active restriction enzyme with a plasmid
encoding hygromycin resistance leads to single and apparently random
integration of the linear plasmid into the A. fumigatus genome. Sequence
tags are introduced into a convenient site in one of two vectors for
hygromycin resistance, and used to transform a clinical isolate of A.
fumigatus.

(b) Animal model and virulence gene identification

The low-dose model of aspergillosis in neutropenic mice in particular
closely matches the course of pulmonary disease in humans (Smith et al
(1994) Infect. Immun. 62, 5247). Mice are inoculated intranasally with up
to 1,000,000 conidiospores/mouse, and virulent fungal mutants are
recovered 7-10 days later by using lung homogenates to inoculate liquid
medium. Hyphae are collected after a few hours, from which DNA is
extracted for amplification and labelling of tags to probe colony blots of
DNA from the pool of transformants comprising the inoculum. DNA from
the regions flanking the REMI insertion points is cloned by digesting the
transformant DNA with a restriction enzyme that cuts outside the REMI
vector, self ligation and transformation of E. coli. Primers based on the
known sequence of the plasmid are used to determine the adjacent A.
fumigatus DNA sequences. To prove that the insertion of the vector was
the cause of the avirulent phenotype, the recovered plasmid is recut with the
same restriction enzyme used for cloning, and transformed back into the
wild-type A. fumigatus parent strain. Transformants that have arisen by
homologous recombination are then subjected to virulence tests.
1. A method for identifying a microorganism having a reduced
adaptation to a particular environment comprising the steps of: 

(1) providing a plurality of microorganisms each of which is 
independently mutated by the insertional inactivation of a gene with a 
nucleic acid comprising a unique marker sequence so that each mutant 
contains a different marker sequence, or clones of the said microorganism; 

(2) providing individually a stored sample of each mutant 
produced by step (1) and providing individually stored nucleic acid 
comprising the unique marker sequence from each individual mutant; 

(3) introducing a plurality of mutants produced by step (1) into the 
said particular environment and allowing those microorganisms which are 
able to do so to grow in the said environment; 

(4) retrieving microorganisms from the said environment or a 
selected part thereof and isolating the nucleic acid from the retrieved 
microorganisms;

(5) comparing any marker sequences in the nucleic acid isolated 
in step (4) to the unique marker sequence of each individual mutant stored 
as in step (2); and 

(6) selecting an individual mutant which does not contain any of 
the marker sequences as isolated in step (4). 

2. A method according to Claim 1 wherein the plurality of 
microorganisms as defined in step (1) is produced from a plurality of 
microorganisms, each of which comprises a nucleic acid comprising a 
unique marker sequence, by changing their condition from a first given 
condition to a second given condition wherein (a) in the first given condition 
the said nucleic acid comprising a unique marker is maintained episomally 
and (b) in the second given condition the said nucleic acid comprising a 
unique marker sequence insertionally inactivates a gene.



3. A method according to Claims 1 or 2 further comprising the steps:

(1A) removing auxotrophs from the plurality of mutants produced
in step (1); or

(6A) determining whether the mutant selected in step (6) is an 
auxotroph; or 

both (1A) and (6A). 



4. A method of identifying a gene which allows a microorganism to
adapt to a particular environment, the method comprising the method of any 
one of Claims 1 to 3 followed by the step: 

(7) isolating the insertionally-inactivated gene from the individual 
mutant selected in step (6). 






5. A method according to Claim 4 further comprising the step: 

(8) isolating from a wild-type microorganism the corresponding 
wild-type gene using the insertionally-inactivated gene isolated in step (7) 
as a probe. 

6. A method according to any one of Claims 1 to 5 wherein the 
particular environment is a differentiated multicellular organism. 



7. A method according to Claim 6 wherein the multicellular organism 
is a plant.

8. A method according to Claim 6 wherein the multicellular organism 
is a non-human animal. 

9. A method according to Claim 8 wherein the animal is a mouse, rat,
rabbit, dog or monkey.

10. A method according to Claim 9 wherein the animal is a mouse.

11. A method according to any one of Claims 6 to 10 wherein in step (4)
the microorganisms are retrieved from the said environment at a site remote
from the site of introduction in step (3).

12. A method according to any one of Claims 8 to 10 wherein in step (3)
the microorganism is introduced orally or intraperitoneally.

13. A method according to Claim 12 when dependent on Claims 8 or 9
wherein in step (4) the microorganisms are retrieved from the spleen.

14. A method according to any one of the preceding claims wherein the
microorganism is a bacterium.

15. A method according to any one of Claims 1 to 13 wherein the 
microorganism is a fungus. 

16. A method according to Claim 7 wherein the microorganism is a
bacterium pathogenic to plants.

17. A method according to Claim 7 wherein the microorganism is a
fungus pathogenic to plants.

18. A method according to any one of Claims 8 to 10 wherein the
microorganism is a bacterium pathogenic to animals.

19. A method according to any one of Claims 8 to 10 wherein the
microorganism is a fungus pathogenic to animals.

20. A method according to Claim 18 wherein the bacterium is any one
of Bordetella pertussis, Campylobacter jejuni, Clostridium botulinum,
Escherichia coli, Haemophilus ducreyi, Haemophilus influenzae,
Helicobacter pylori, Klebsiella pneumoniae, Legionella pneumophila,
Listeria spp., Neisseria gonorrhoeae, Neisseria meningitidis, Pseudomonas
spp., Salmonella spp., Shigella spp., Staphylococcus aureus, Streptococcus
pyogenes, Streptococcus pneumoniae, Vibrio spp., and Yersinia pestis.

21. A method according to Claim 19 wherein the fungus is any one of
Aspergillus spp., Cryptococcus neoformans and Histoplasma capsulatum.

22. A method according to any one of the preceding claims wherein in
step (1) the gene is insertionally inactivated using a transposon or
transposon like element or other DNA sequence carrying a unique marker
sequence.

23. A method according to any one of the preceding claims wherein in
step (1) each different marker sequence is flanked on either side by
sequences common to each said nucleic acid.

24. A method according to Claim 23 wherein in step (2) the nucleic acid
comprising the unique marker is isolated using DNA amplification
techniques and oligonucleotide primers which hybridise to the said common
sequences.

25. A method according to Claim 23 or 24 wherein in step (4) the
nucleic acid comprising a plurality of said marker sequences is isolated
using DNA amplification techniques and oligonucleotide primers which
hybridise to the said common sequences.

26. A microorganism obtained using the method of any one of the
preceding claims.

27. A microorganism comprising a mutation in a gene identified using the
method of Claim 5.

28. A microorganism obtained according to Claim 26, when dependent 
on Claim 8, or Claim 27 for use in a vaccine. 

29. A vaccine comprising a microorganism according to Claim 26, when
dependent on Claim 8, or Claim 27 and a pharmaceutically-acceptable
carrier.

30. A gene obtained using the method of Claims 4 or 5. 

31. A gene according to Claim 30 which is isolated from the Salmonella 
typhimurium genome and hybridises to the sequence shown in Figure 5 
under stringent conditions. 

32. A gene according to Claim 30 which is isolated from the Salmonella 
typhimurium genome and hybridises to a sequence shown in Figure 6 under 
stringent conditions. 

33. A polypeptide encoded by a gene according to any one of Claims 30 
to 32. 

34. A method of identifying a compound which reduces the ability of a
microorganism to adapt to a particular environment comprising the step of
selecting a compound which interferes with the function of a gene according
to any one of Claims 30 to 32 or a polypeptide according to Claim 33.

35. A compound identifiable by the method of Claim 34. 

36. A compound according to Claim 35 wherein the particular 
environment is a host organism. 

37. A compound according to Claim 36 wherein the host organism is a 
plant. 

38. A compound according to Claim 36 wherein the host organism is an 
animal. 

39. Use of a compound according to any one of Claim 36 to Claim 38 
for treating infection of said host organism with said microorganism. 

40. A molecule which selectively interacts with, and substantially inhibits 
the function of, a gene according to any one of Claims 30 to 32 or a nucleic 
acid product thereof. 

41. A molecule according to Claim 40 which is an antisense nucleic acid 
or nucleic acid derivative. 

42. A molecule according to Claim 40 or 41 which is an antisense 
oligonucleotide. 

43. A molecule according to any one of Claims 40 to 42 for use in 
medicine. 

44. A method of treating a host which has, or is susceptible to, an
infection with a microorganism, the method comprising administering an
effective amount of a molecule or compound according to Claim 36 or 40
wherein said gene is present in said microorganism, or a close relative of
said microorganism.

45. A pharmaceutical composition comprising a molecule or compound 
according to Claim 38 or 40 and a pharmaceutically acceptable carrier. 

46. The VGC2 DNA of Salmonella typhimurium or a part thereof, or a
variant of said DNA or a variant of a part thereof. 

47. A mutant bacterium wherein if the bacterium normally contains a
gene that is the same as or equivalent to a gene in VGC2, said gene is
mutated or absent in said mutant bacterium.

48. A method of making a bacterium according to Claim 47. 

49. Use of a mutant bacterium according to Claim 47 in a vaccine. 


50. A pharmaceutical composition comprising a bacterium according to 
Claim 47 and a pharmaceutically acceptable carrier. 

51. A polypeptide encoded by VGC2 DNA of Salmonella typhimurium
or a part thereof, or a variant of said polypeptide or a variant of a part
thereof.

52. A method of identifying a compound which reduces the ability of a
bacterium to infect or cause disease in a host comprising the step of
selecting a compound which interferes with the function of a gene in VGC2
according to Claim 46 or a polypeptide according to Claim 51.

53. A compound identifiable by the method of Claim 52. 

54. A molecule which selectively interacts with, and substantially inhibits 
the function of, a gene in VGC2 of Salmonella typhimurium or a nucleic 
product thereof. 

55. A molecule or compound according to Claim 53 or 54 for use in 
medicine. 

56. Any novel feature or combination of features disclosed herein. 
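
By way of illustration of the arrangement recited in Claims 23 to 25, in
which each unique marker sequence is flanked by sequences common to every
tagged nucleic acid, the following minimal sketch recovers the variable
region lying between two invariant arms. The arm sequences, the reads and
the function name are hypothetical and chosen for the example only; they
are not sequences disclosed herein.

```python
import re


def extract_tags(read, left_arm, right_arm):
    """Return every variable region flanked by the two invariant arms."""
    pattern = re.escape(left_arm) + r"([ACGT]+?)" + re.escape(right_arm)
    return re.findall(pattern, read.upper())


# Hypothetical arms and reads for illustration only.
LEFT, RIGHT = "AAGCTT", "GGTACC"
reads = ["CCAAGCTTACGTACGTGGTACCTT", "AAGCTTTTGCAAGGTACC"]
tags = [tag for read in reads for tag in extract_tags(read, LEFT, RIGHT)]
print(tags)  # ['ACGTACGT', 'TTGCAA']
```

Because the arms are the same in every construct, a single primer pair
directed to them would be expected to amplify all markers in a pool at
once, which is the property the sketch relies on.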



ABSTRACT 

IDENTIFICATION OF GENES

A method for identifying a microorganism having a reduced adaptation to
a particular environment comprising the steps of:

(1) providing a plurality of microorganisms each of which is
independently mutated by the insertional inactivation of a gene with a
nucleic acid comprising a unique marker sequence so that each mutant
contains a different marker sequence, or clones of the said microorganism;

(2) providing individually a stored sample of each mutant
produced by step (1) and providing individually stored nucleic acid
comprising the unique marker sequence from each individual mutant;

(3) introducing a plurality of mutants produced by step (1) into the
said particular environment and allowing those microorganisms which are
able to do so to grow in the said environment;

(4) retrieving microorganisms from the said environment or a
selected part thereof and isolating the nucleic acid from the retrieved
microorganisms;

(5) comparing any marker sequences in the nucleic acid isolated
in step (4) to the unique marker sequence of each individual mutant stored
as in step (2); and

(6) selecting an individual mutant which does not contain any of
the marker sequences as isolated in step (4).

A) new virulence factors with similarity to
sequenced genes:

1. p1F10

similarity to clpP (E. coli)
(Figure 5)

2. p2D6

similarity to lcrD (Yersinia spp.)
sequence p2D6_1_I

GGTCTTAATGTACGGGCATGGTCTGCATCGATAACTCCGGCACGCAAATCG 
CCATCGATACTCATTTGTTTGGCTGGCATCCCATCAAGCGAGAAACGTGCG 
CTAACTTCCGCCACCCTCTCGATACCTTTTGTAATGACAATAAATTGCACG 
ATAGTAATGATGGTAAATACGACCAACCCAACGGTGAGATTTCCTCCTACG 
ACAAACTTACCGAAAGCATCCACAAATATTACCGGCATTATGTTGTAACAG 
TACCCAGCCGTGATGTGCTGATTGGGGAGTTAACAACCGATTTAT 

3. S4C3

probably same gene as p2D6, but different region
similarity to S. typhimurium invA and Yersinia
spp. lcrD

sequence s4C3_1_U

GCGCGGACGCTAGTGTGGTGGGTGACAGCCAGACGTTACCGAACGGGATGG 
GGCAGATCTGTTGGCTTACAAAAGACATGGCCCATAAGGCGCAAGGTTTTG 
GGACTGGACGTTTTCGCGGGCAGACAACGTATCTCTGTCTTATTAAAATGT 
GTCCTGCTTCGGCATATGTATCGAACCCTCGGAGCAAAGTCGTTTGGGCGC 
AGAATTAGTACGTTTGGGTCGGTTGCTGTTATTCCTTGGGCTCGGAAAAAG 
AGTGCCAGCGTGAAGGAGTGGGATTTGGCAGACTGGCCGCCTAAT

sequence s4C3_1_R

CACTATAGGGAAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCTACTA 
GTCATATGGATTGCACTTGTGTATAAGAGTCAGGATTAGAGGAGATGCGCC 
GGGAACCATACTATCTTTTTCCGGTGCTTCGACGCCATTTGCGGAAACCAC 
AGACTTTTTGCGGCGAATGAGGATAATTGGCAATGCTAACAACGCTGAAAA 
GAAAGCGAGAGTGATAAAAGGAAAGCCAGGAATTAAAGCGAGGAGCATTAA 
AACCACAGCGGCTAATATGAGCGACTGAGGTTGTCTGGCAATTTG 

4. p3F4

similarity to invG (S. typhimurium)
sequence p3F4_1_U

TGCAGGCCGACTCTAGAGGATCCCCGGGTACCGGTAATTTCTTTAACCTCG 
CATCCCGGTGGATGAAAGGATATTCTGGCTGCGTAAGTAATGAATGAACCG 
CCCAGTAGATAAAATATTGAAAGTGATAACCTGATGTTTTAATAACGATGC 
AGGATATACATATAACATGCTGGCATCAAACCAGGTAAGCAAATCATATTG 
TGCTGCCAGGTTATTCAAACTATCGACCGGTGGTCCAGGCGGGAATTTTTC 
CACTAAATGTAGGTGGGATCAATGGGCTAATTGGTATAGGCGGAT 

5. p7G2 

similarity to yscC (Yersinia spp.) 
sequence p7G2_1_U

CCTGTGATTCCGGATGAAATAGCTTTTACGAAAGCTGTCAGACNTGCTGAA 
GAATACGCTGCAAATGGTAAGCTTGTAACTTTTGGGTATTGTTCCAACGCA 
TGCTGAAACGGGTTATGGATATATTCGTCGCGGTGAGTTGATAGGAAATGA 
CGCTTATGCAGTGGCTGAATTTGTGGAGAAACCGGATATCGATACCGCCCG 
TGACTATTTCAAATCAGGGGAAATATTACTGGCCTAGCGGCGATGTTTTTA 
TTTCGCGCAAAGCCCTTATTTAAACGAATTAAACGTATCTATCACCCCCAA 
ATTCATACAGCTTGTGAA 

sequence p7G2_3_O

TTACTAAACAGGGCCCCGGACCATGTAAACACCACGCTTGCCAACACTAAA
AAACGATGCTTGCCGTAAAAAAATTGAACGTTATTTACTTAATACGCCTAT 
TTTATTTACATTATGCACGGACAGAGGGTGAGGATTAAATGGATAATATTG 
ATAATAAGTATACTCCACAGCTATGTAAAATTTTGGGGGCTATATCGGATT 
TGGTTGTTTTTAATTTAGCCTTATGGCTTTCACTAGGATGTGTCTATTTTT 
TTTGTGGTCAAGCACAGAGATTTATTCCCCAACCACC 

sequence p7G2_1_I

TTTCCTTGCCGTGACAGTCCGGGATGCGAGGTTAACGAAATTACCGGCACC 
AAAGCTGTGGAGGTGAGCGGTGTCCCCAGCTGCCTGACTCGTATTAGTCAA 
TTAGCTTCAGTGCTGGATAATGCGTTAATCAAACGAAAAGACAGTGCGGTG 
AGTGTAAGTATATACACGCTTAAGTATGCCACTGCGATGGATACCCAGTAC 
CATTATCGCGATCAGTCCGTCGTGGTTCCAGGGGTCGCCTAGTGTATTGCG 
TGAGATGAGTAACACCAGCGTCCCGACGTCATCGACGAACAATGG 

6. p9B7

similarity to fliQ, invX (E. coli)
sequence p9B7_1_I

CATGAGTAACCTACCCAACTGTAATCTTTACCAATATGCATCATAATCTTC 
TGCTGGTAAATGATTGGTAATATCGGAAAGGTAAGTGACATAAGCACGCCA 
TTACGTAAAAGTGCGGCCCCTAAACTGCCACTTTTTAATAAGGGAAGTAAT 
AAAGAAAGGCTCAATGGTCGAATAAAAGCCACAGCCAATGCAATAAGCCAC 
TCATTTACCTGTTGTGCCATTCAACCATGCTCTCCAATTCGTAACATTATC 
TGCCGGGTATAATTCAACAGGATACCGCTAAGCCATGGGTAG 

sequence p9B7_3_0 

ATTCCAGCCCCCGGGCCATCTAACCACTATGAACAATCATCTTCTGGGTGG 
ACAATCATTGGTACCATCGGCCAGGCTTGTGCAATATGTATGTCATCACGT 
AAAAGCGCGGCCCCTTAATCTCCCCATTCTTCCTTAAGGGCAGTTATCACG 
GCTGGCTCAATGGCCGGCTTAACAGCCACAG 

7. S6F5

similarity to yscU (Y. enterocolitica) 
sequence s6F5_l_0 

GAGGCGCGTCTTCGGTTGAGGGTCGCCCTCCAGATCTTTATGCTCCTGTTT 
TACGTCATCTTTACTCATTTTAAGATCTTTTCTAATCTTATAATATTGAAA 
AGAATAGTCCAGTATGCCAACGACGAAATAAAGAAACATCACCCCAACCCA 
TAACCATTTTTTCAATGATGAAAGCACAAGCACGCCACAGGCTACACCACA 
GCCCGGAGGGGGCCGGAAAGTGCTGGGATCTTGATTAATGAAAAAGGCAAA 
GGGAAGAGATAGGATGATGCATGCTGGTTGGAGGCAGATTATTCATCTTCG

B) new sequences without similarity to entries in
DNA or protein database:

1. S4D10 

sequence s4D10_l_U 

AGTTGCCGTATTTATTAAATATTCACCTCAGGTCAATATGGAGGTCTTCCC 
GGCTAAAAATCATTGCTTTACTAGAGATATCACTCCCTGGGTTGCAATACA 
GTACGATTAGTTATCTTGATGCAGCCTGCTGATTTCAGAATGGCAGCTGAC 
GTACCCGCGAGACAAACATTCTGGATTATGGACGTTATCAACGCCAATATA 
GGGAAGGTGGTGAAGTGGTTGATGAAATACCCCTATCCCTTGCATGTTATC 
GCTGACAGGACTGTTATCAGGAGCGGGCATCCTCGATCGGCT 

sequence s4D10_l_R 

CAAGAGACAGATCCAACTCGGGCCGATCGCCATAACGCCAGCAGTTTGAAA 
GATGAAAGCCCAGCTTATCCAGCCATTCCGGTACAGCGTAACGAGCAGGTT 
GCCAGAAATAACGATAAAGTTGCAACACCTCGGGATCAGGTCGGCTCAAAA 
ACGGGGTCTCAGGCAAAAATAGCCGATCAGGATGCCCACTCCTAATAACAG 
TCCTGTCAACGATAACATCAACGGATAAGGGTATTTCATCAACCACTTCAC 
CACCTTCCCTTTATTGGCGTTGGATAACGTCCATAATCCAGA 

2. S4H10 

sequence s4H10_l_U 

AGGGCTTTATTGATTCCATTTTTACACTGATGAATGTTCCGTTGCGCTGCC
CGGATTACAGCCGGATCCTCTAGAGTCGACCTGCAGAACCGAGCCAGGAGC 
AAATTAATTTTTTTGGGCAATTGCTGAAAGATGAAGCATCCACCAGTAACG 
CCAGTGCTTTATTACCGCAGGTTATGTTGACCAGACAAATAGATTATATGC 
AGTTAACGGTAGGCGTCGATTATCTTGTCAGAATATCAGGCGCAGCATCGC 
AAGCGCTTAATAAGCTGGGTAACATGGCATGAAGGGGCAACCC 

sequence s4H10_l_R 

CACTATAGGGAAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCTACTA 
GTCATATGGATTCCTAGGCGGCCAGATCTGATCAAGAGACAGATCCAACTC 
GGGCCGATCGCCATAACGCCAGCAGTTTGAAAGATGAAAGCCCAGCTTATC 
CAGCCATTCCGGTACAGCGTAACGAGCAGGTTGCCAGAAATAACGATAAAG 
TTGCAACACCTCGGGATCAGGTCGGCTCAAAAACGGGGTCTCAGGCAAAAA 
TAGCCGATCAGGATGCCCACTCCTAATAACAGTCCTGTCAACG

3. p4G5

sequence p4G5_l_0 

CCCCCCCCCTTCTCCTGGCTTACACAGCCCCAGACCGGCGCTGGAAAAGGC 
CATTCCCGCCATACAGGAGGCCAGCAACATATTTTCACGCGCCGCCAGATC 
GTGGCCGTAACCCACGGCTTTCGGCAGCGATTTGCCAATCATCGCTATCGC 
GCCAATCGCCAGGCTGTCGGTAAACGGCGTGGCGTTGAGCGCGCTGTAGGC 
CTCAATCGCATGCGTCAACGCATCGATACCGGTCATCGCCGTCACGTTTGG 
CGGAACGCCTTCGGTCACGGAAGCATCAAGAATCGCCACGTCCGGC 

sequence p4G5_l_U 

CGCGAACGTGCGCCGCAACTGCTTGTGGACGGTGAATTGCAGTTTGACGCC 
GCTTTCGTGCCGGAGGTCGCCGCGCAAAAAGCGCCTGACAGCCCGCTGCAA 
GGCCGCGCCAACGTGATGATTTTCCCGTCGCTGGAGGCGGGCAATATTGGC 
TACAAAATCACTCAGCGTCTGGGAGGCTATCGCGCTGTTGGGCCGCTAATT 
CAGGGGCTTGGCGCGCCGCTTCACGACCTCTCCCGAGGCTGTAGCGTGCAG 
GAAATTATCGAACTGCGGTTGGTGAGAAAACCAA 

4. p7A3

sequence p7A3_l_U 

CGCCCTAGCATGCCTGGCGTTGTCCGGTTATTGCTCGTCAAGCGAACAGAT 
GCAAAAGGTGAGAGCGACTCTCGAATCATGGGGGGTCATGTATCGGGATGG 
TGTAATCTGTGATGACTTATTGGTACGAGAAGTGCAGGATGTTTTGGATAA 
AAATGGGTTACCCGCATGCTGAAGTATCCAGCGAAGGGCCGGGGAGCGTGT 
TAATTCATGATGATATACAAATGGATCAGCAATGGCGCAAGGTTCAACCAT 
TACTTGCAGATATTCCCGGGTTATTGCACTGGCAGATTAGTCACTCTC 

sequence p7A3_l_I 

CCCTTCCCAGGCTCGACAGGTACACAGCCAGCCACTGGTGCAGGCAGTTAC 
TTGCTTTCATCATGGGAAGGAGCAATATCCTGATATATTAAAGAAAGAGCG 
GGATCCCCTTTCTTTACTGCTGCTAACGTTTCTTGCAAAATGCGTTGATGA 
GATTCATCCAGCACACCACTGATAAGAAAAGAGCGCCGCATTGGCGTAACA 
TTGACAAGCCCCACTAAACCGCTCTCTATTATCGCAGAAATAATATCATCC 
CCCTGAGACTGATGAGAGTGACTATTCTGCCAGCGCAAATAACCC 

5. p10E11
sequence p10E11_l

ATACCGAGTATTAAGCGGCTGTGTAACATCGTCATCCAACAACATACGCAG 
CGAGCCGCCACGCCGGAAAAACCGCATCGTGTCATGTGCCTGTTGTAGGGT
CGGGTCTTTTTTCATGAGTACGTTTTCTGCGCTATCATACTGGAAATTTCC 
CCCCACTTACTGATAAGCCCTGTCAGTTGGGTAAGGACAGAGTTAAGCTCC 
TGAGACATTTTTTGGAATGGTTATCTTTCCCCGACTCATAAAATCGGTATT 
CCCGCTGGGGGCAATATCCAAAGACGCTTTGGTCGCCCGTAGGGCACC 

Figure 6E 



sequence p10E11_U

GCCGTATGCCTGCAGTTGCCCGGTTATTGCTCGTCAAGCGAACCGATGCCA 
AAGGTGAGAGCGACTCTCGAATCATGGGGGGTCATGTATCGGGATGGTGTA 
ATCTGTGATGACTTATTGGTACGAGAAGTGCAGGATGTTTTGGTAAAAATG 
GGTTACCCCCATGCTGAAGTATCCAGCGAAGGGGCGGGGAGCGTGTTAATT 
CACGATGATATTCAAATGGGTCAGCAATGGGGCAAGGTTCAACCCCCACTT 
GCAGATATTCCCCCCCCTATTGGACTGGCAGATTAGTCACTCTCA 

6. S4B9 

sequence s4B9_l_0 

GGGCGACCTGCCCGCGGCGCAACTTTCCCCGAAGCGTTTTCCATTTCCTTG 
TTCTTAAATGACCTGGAAAGCTTACCTAAGCCTTGTCTTGCCTATGTGACA 
ATACTGCTTGGAGAACACCCGGACGTCCATGATTATGCTATACAGATCACA 
GCGGATGGGGGATGGTGAATCGGTTATTATACCACAAGTCGCAGCTCTGAG 
CTTATTGCTATTGAGATAGAAAAACACCCCGCTTCAACTTGGATTTTGAAT 
AATGTAATACGCAATCACCATACACTATATTCGGGTGGCGTATAA 

sequence s4B9_l_R 

TTCGAGCTGGGGCACCGCTAATATCTTTAACCTCGCATCCCGGTGATGAAA 
GGATATTCTGGCTGCGTAAGTAATGAATGAACCGCCCAGCAGATAAAATAT 
TGACAGTGATAACCCGATGTTTTTTTAACGATGCAGGCTATACATATAACA 
TAGCTGGCCACCAACACAGCTGAAGTAAATCATATTGTTGCTGCCAGGGTA 
CTTCACACTATTGTCCGGCGGGCCAGCGGGGATTTTCCCCCTAAATCTCGC 
TGGTTCTCAAA 

7. p4F8 

sequence p4F8_l_I 

AGTCTACGATTTCGCTATATCTTCTCTTAATCATGGCCGCCATTTGTGGAT 
GCGATTTTAAAATATCCGGGCGATCTTTCATTAAAAAATAAAGATTCCCCA 
TGACTTCACAGATAAAGGTATCGGTATTTTGAGTGATACGTAACAATTCGT 
TCTCTTCGTGTGGGTCCATGATGCGAAGAATAATGGTGGCATCATTTTCAT 
GAGGATTATGAACCCGAAATCTTTCTCTTTGCGATGCGCAGGCTAACTCTT 
TCAACTCAAAAAAAATCTCTGTAAGCCGCTCTCGTGTGGGGGCGC 

8. p7B8 

sequence p7B8_l_0 

GCGCCCCTTTAATTGGTTGAGGCGGCTGGTATTCTTGTAAGGGTAATACTA 
GCGAGACCCAGGTTCCACCCCCGGGGACACTTTTTAGTGTCAGATTACCGC 
CCATCATTTTAGCCAGGCTTGACGCAATAGTCAGTCCAATTCCTGTACCTT
GCGAATTTGTGTCTGCTTGATAAAAAGCAGAAAAGATTTGAGACTGCTGCT
GTTTTTCAATCCCCCCACCGCTATCGCTAACCAGAAATATTAATTGTTCCT
CACCAAGATTGAGCGCCAGACGTATCCCTCCCCCCTCGGGAAAT 

Figure 6F 



9. p8G12 

sequence p8G12_l_I 

GGATAAGATCCCGGATAAGTATGTCAGGCTCGTATGCACAACAGGCATTAT 
AAACCTCTAGACCATTTTTAACATGCTCTACTATTTTAAAATGAGGCCAGG 
GTAATAAGGCATTCATAATGCCGTTAATGATGATTTCATGATCGTCTACTA 
ATAAGATCTTATATTCTTTCATTTGGCTGCCCTCGCGAAAATTAAGATAAT 
ATTAAGTAATGGTGTAGGTTGTGGAGATCATACGTATTTTCTGGCGTAAGT 
CGGTTAGTTCCTCCAGCGCGATGATTTTCCCCATTTTTACGCGAT 

10. p9G4 

sequence p9G4_l_0 

TTCCATATTGCTCGTCCGGGGAGCGTGTTAATTCTTGATGATATACCAATG
GATCTGCAATGGCGCAAGGTTCAACCATTACTTGGAGATATTCCCGGGTTA 
TTGTACTGGGAGATTAGTCACTCTCATCAGTCTCAGGGGGGTGATGTTATT 
TCTGGGATAATAGAGCAACGGCGTTAGCAGGGGTCGGTCAGTAGTCACGGC 
CAACTTCGGTGCACTTTTGCGTATCACTGGGGTATCATAACTGAATCTCAT 
CCCCCCCACTTTGGTAATCACAC 

sequence p9G4_l_U 

AATTCTTTTACCTCCATAAGCTGCGTGGCATAGCGATACAGAGTATTAAGC 
GGGTGTGTTACATCGTCATCCAACAACATACGCAGCGAGCCGCCACGCCGG 
AAAAACCGCATCGTGTCATGTGCCTGTTGTAGGGTCGGGTCTTTTTTTCAT 
GAGTACGTGTTCTGCGCTATCATACTGGAAATTTCCCCCCACTTACTGATA 
AGCCCTGTCAGTTGGGTAAGGACAGCGTTAAGCTCCTGAGACATTTTTTGA 
GTTGTTATCTGCCCCCCGACTCATAAGATCGGGTATTCCGCGGTGG 

11. p9B6 
sequence p9B6_l 

ATATCCCTAATGCTTTTCCTTAAAATAAATACCACGGAAGGATACTGGCCA 
CCTAGCCAAATTTAGAAAGCAATGAACATCCGGTTTATTCCTGAAAACGAT 
TACTCCGGCGCACGTTGTTCTGGCGTTACCTGAGCCAGCAAACGATATAAT 
GGGGTGGTGACCCGCATACCGGTCATTGGCATCCCATCCACACCGGAGGGA 
GTAAAACTCATTAGGCCATAGGTAATATCATTAAGACGCTCTAATAAATGA 
GGGTGGGGGGCCCAAACTACCACTCCAGTATGTATTGAGTCA

12. p6G5

sequence p6G5_2_I 



CCCATGGGCGCAATTTGTTGCGCAGCGTTTACCCGACCATCGCGTTTATGA 
GCTGTAATTCATGGGGGGTAAAAACGGGCGTGACGACCCCAACGGAAGATA
AGGCCGGGCTTAAACAGGAGATTATTGCTAATGCGCAGCGCAAAGTGTTGC 
TGGCGGACAGCAGTAAGTATGGCGCGCATTCGCTCTTTAATGTGGTGCCGC 
TTGAGCGCTTTAATGACGTGATTACCGACGTCAATCTGCCGCCGTCAGCGC 
AGGTTGAACTGAAAGGGCGCGCTTTTTGCGCTAACG

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



Applicants: David William Holden 



Serial No.: Continuation of 09/201,945 



Group Art Unit: Not Yet Assigned 



Filed: 



Examiner: Not Yet Assigned 



For: IDENTIFICATION OF GENES



Assistant Commissioner for Patents 
Washington, D.C. 20231 



Sir: 



ASSOCIATE POWER OF ATTORNEY UNDER 37 C.F.R. § 1.34



Please recognize as Associate Patent Attorneys and Patent Agent in this case: 

Robert A. Hodges Reg. No. 41,074
Kevin W. King Reg. No. 42,737

Please continue to direct all communication to Patrea L. Pabst at the following address: 

Patrea L. Pabst 

ARNALL GOLDEN & GREGORY, LLP 
2800 One Atlantic Center 
1201 West Peachtree Street 
Atlanta, Georgia 30309-3450 
(404) 873-8794; 873-8795 (fax) 



Date: November 16, 2000




ARNALL GOLDEN & GREGORY, LLP 

2800 One Atlantic Center 

1201 West Peachtree Street 

Atlanta, Georgia 30309-3450 

(404) 873-8794 

(404) 873-8795 (fax)



Patrea L. Pabst 
Reg. No. 31,284 



RPMS 101 CON (3) 
20001/10 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: David William Holden 
(ii) TITLE OF INVENTION: Identification of Genes 
(iii) NUMBER OF SEQUENCES: 501 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Patrea L. Pabst 

(B) STREET: 2800 One Atlantic Center 

1201 West Peachtree Street 

(C) CITY: Atlanta 

(D) STATE: Georgia 

(E) COUNTRY: USA 

(F) ZIP: 30309-3450

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 01-DEC-1998 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/637,759 

(B) FILING DATE: 03-MAY-1996 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/GB95/02875 

(B) FILING DATE: 11-DEC-1995

(C) CLASSIFICATION: 
(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Pabst, Patrea L.

(B) REGISTRATION NUMBER: 31,284

(C) REFERENCE/DOCKET NUMBER: RPMS 101 CON 2
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (404) 873-8794 

(B) TELEFAX: (404) 873-8795 

(2) INFORMATION FOR SEQ ID NO: 1: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Synthetic oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CTAGGTACCT ACAACCTCAA GCTTNKNKNK NKNKNKNKNK NKNKNKNKNK NKNKNKNKNK
NKNKAAGCTT GGTTAGAATG GGTACCATG

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Synthetic oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TACCTACAAC CTCAAGCT 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Synthetic oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CATGGTACCC ATTCTAAC 

(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Synthetic oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TACCCATTCT AACCAAGC 

(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Synthetic oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CTAGGTACCT ACAACCTC 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Synthetic oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CCTAGGCGGC CAGATCTGAT 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Synthetic oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCACTTGTGT ATAAGAGTCA G 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGTCTTAATG TACGGGCATG GTCTGCATCG ATAACTCCGG CACGCAAATC GCCATCGATA 60

CTCATTTGTT TGGCTGGCAT CCCATCAAGC GAGAAACGTG CGCTAACTTC CGCCACCCTC 120

TCGATACCTT TTGTAATGAC AATAAATTGC ACGATAGTAA TGATGGTAAA TACGACCAAC 180

CCAACGGTGA GATTTCCTCC TACGACAAAC TTACCGAAAG CATCCACAAA TATTACCGGC 240

ATTATGTTGT AACAGTACCC AGCCGTGATG TGCTGATTGG GGAGTTAACA ACCGATTTAT 300

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCGCGGACGC TAGTGTGGTG GGTGACAGCC AGACGTTACC GAACGGGATG GGGCAGATCT 60

GTTGGCTTAC AAAAGACATG GCCCATAAGG CGCAAGGTTT TGGGACTGGA CGTTTTCGCG 120

GGCAGACAAC GTATCTCTGT CTTATTAAAA TGTGTCCTGC TTCGGCATAT GTATCGAACC 180

CTCGGAGCAA AGTCGTTTGG GCGCAGAATT AGTACGTTTG GGTCGGTTGC TGTTATTCCT 240

TGGGCTCGGA AAAAGAGTGC CAGCGTGAAG GAGTGGGATT TGGCAGACTG GCCGCCTAAT 300

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CACTATAGGG AAAGCTTGCA TGCCTGCAGG TCGACTCTAG AGGATCTACT AGTCATATGG 60

ATTGCACTTG TGTATAAGAG TCAGGATTAG AGGACATGCG CCGGGAACCA TACTATCTTT 120

TTCCGGTGCT TCGACGCCAT TTGCGGAAAC CACAGACTTT TTGCGGCGAA TGAGGATAAT 180

TGGCAATGCT AACAACGCTG AAAAGAAAGC GAGAGTGATA AAAGGAAAGC CAGGAATTAA 240

AGCGAGGAGC ATTAAAACCA CAGCGGCTAA TATGAGCGAC TGAGGTTGTC TGGCAATTTG 300

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TGCAGGCCGA CTCTAGAGGA TCCCCGGGTA CCGGTAATTT CTTTAACCTC GCATCCCGGT 60

GGATGAAAGG ATATTCTGGC TGCGTAAGTA ATGAATGAAC CGCCCAGTAG ATAAAATATT 120

GAAAGTGATA ACCTGATGTT TTAATAACGA TGCAGGATAT ACATATAACA TGCTGGCATC 180

AAACCAGGTA AGCAAATCAT ATTGTGCTGC CAGGTTATTC AAACTATCGA CCGGTGGTCC 240

AGGCGGGAAT TTTTCCACTA AATGTAGGTG GGATCAATGG GCTAATTGGT ATAGGCGGAT 300

(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCTGTGATTC CGGATGAAAT AGCTTTTACG AAAGCTGTCA GACNTGCTGA AGAATACGCT 60

GCAAATGGTA AGCTTGTAAC TTTTGGGTAT TGTTCCAACG CATGCTGAAA CGGGTTATGG 120

ATATATTCGT CGCGGTGAGT TGATAGGAAA TGACGCTTAT GCAGTGGCTG AATTTGTGGA 180

GAAACCGGAT ATCGATACCG CCCGTGACTA TTTCAAATCA GGGGAAATAT TACTGGCCTA 240

GCGGCGATGT TTTTATTTCG CGCAAAGCCC TTATTTAAAC GAATTAAACG TATCTATCAC 300

CCCCAAATTC ATACAGCTTG TGAA 324

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 292 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TTACTAAACA GGGCCCCGGA CCATGTAAAC ACCACGCTTG CCAACACTAA AAAACGATGC 60 

TTGCCGTAAA AAAATTGAAC GTTATTTACT TAATACGCCT ATTTTATTTA CATTATGCAC 120 

GGACAGAGGG TGAGGATTAA ATGGATAATA TTGATAATAA GTATACTCCA CAGCTATGTA 180

AAATTTTGGG GGCTATATCG GATTTGGTTG TTTTTAATTT AGCCTTATGG CTTTCACTAG 240 

GATGTGTCTA TTTTTTTTGT GGTCAAGCAC AGAGATTTAT TCCCCAACCA CC 292 

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTTCCTTGCC GTGACAGTCC GGGATGCGAG GTTAACGAAA TTACCGGCAC CAAAGCTGTG 60

GAGGTGAGCG GTGTCCCCAG CTGCCTGACT CGTATTAGTC AATTAGCTTC AGTGCTGGAT 120

AATGCGTTAA TCAAACGAAA AGACAGTGCG GTGAGTGTAA GTATATACAC GCTTAAGTAT 180

GCCACTGCGA TGGATACCCA GTACCATTAT CGCGATCAGT CCGTCGTGGT TCCAGGGGTC 240

GCCTAGTGTA TTGCGTGAGA TGAGTAACAC CAGCGTCCCG ACGTCATCGA CGAACAATGG 300 

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 297 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Partial sequence of Salmonella typhimurium
virulence gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CATGAGTAAC CTACCCAACT GTAATCTTTA CCAATATGCA TCATAATCTT CTGCTGGTAA 60 

ATGATTGGTA ATATCGGAAA GGTAAGTGAC ATAAGCACGC CATTACGTAA AAGTGCGGCC 120

CCTAAACTGC CACTTTTTAA TAAGGGAAGT AATAAAGAAA GGCTCAATGG TCGAATAAAA 180

GCCACAGCCA ATGCAATAAG CCACTCATTT ACCTGTTGTG CCATTCAACC ATGCTCTCCA 240

ATTCGTAACA TTATCTGCCG GGTATAATTC AACAGGATAC CGCTAAGCCA TGGGTAG 297 

(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 184 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ATTCCAGCCC CCGGGCCATC TAACCACTAT GAACAATCAT CTTCTGGGTG GACAATCATT 60

GGTACCATCG GCCAGGCTTG TGCAATATGT ATGTCATCAC GTAAAAGCGC GGCCCCTTAA 120

TCTCCCCATT CTTCCTTAAG GGCAGTTATC ACGGCTGGCT CAATGGCCGG CTTAACAGCC 180

ACAG 184 

(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GAGGCGCGTC TTCGGTTGAG GGTCGCCCTC CAGATCTTTA TGCTCCTGTT TTACGTCATC 60

TTTACTCATT TTAAGATCTT TTCTAATCTT ATAATATTGA AAAGAATAGT CCAGTATGCC 120

AACGACGAAA TAAAGAAACA TCACCCCAAC CCATAACCAT TTTTTCAATG ATGAAAGCAC 180

AAGCACGCCA CAGGCTACAC CACAGCCCGG AGGGGGCCGG AAAGTGCTGG GATCTTGATT 240

AATGAAAAAG GCAAAGGGAA GAGATAGGAT GATGCATGCT GGTTGGAGGC AGATTATTCA 300

TCTTCG 306

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 297 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

AGTTGCCGTA TTTATTAAAT ATTCACCTCA GGTCAATATG GAGGTCTTCC CGGCTAAAAA 60 

TCATTGCTTT ACTAGAGATA TCACTCCCTG GGTTGCAATA CAGTACGATT AGTTATCTTG 120 

ATGCAGCCTG CTGATTTCAG AATGGCAGCT GACGTACCCG CGAGACAAAC ATTCTGGATT 180

ATGGACGTTA TCAACGCCAA TATAGGGAAG GTGGTGAAGT GGTTGATGAA ATACCCCTAT 240

CCCTTGCATG TTATCGCTGA CAGGACTGTT ATCAGGAGCG GGCATCCTCG ATCGGCT 297 

(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 297 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CAAGAGACAG ATCCAACTCG GGCCGATCGC CATAACGCCA GCAGTTTGAA AGATGAAAGC 60

CCAGCTTATC CAGCCATTCC GGTACAGCGT AACGAGCAGG TTGCCAGAAA TAACGATAAA 120

GTTGCAACAC CTCGGGATCA GGTCGGCTCA AAAACGGGGT CTCAGGCAAA AATAGCCGAT 180

CAGGATGCCC ACTCCTAATA ACAGTCCTGT CAACGATAAC ATCAACGGAT AAGGGTATTT 240

CATCAACCAC TTCACCACCT TCCCTTTATT GGCGTTGGAT AACGTCCATA ATCCAGA 297



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AGGGCTTTAT TGATTCCATT TTTACACTGA TGAATGTTCC GTTGCGCTGC CCGGATTACA 60 

GCCGGATCCT CTAGAGTCGA CCTGCAGAAC CGAGCCAGGA GCAAATTAAT TTTTTTGGGC 120

AATTGCTGAA AGATGAAGCA TCCACCAGTA ACGCCAGTGC TTTATTACCG CAGGTTATGT 180

TGACCAGACA AATAGATTAT ATGCAGTTAA CGGTAGGCGT CGATTATCTT GTCAGAATAT 240

CAGGCGCAGC ATCGCAAGCG CTTAATAAGC TGGGTAACAT GGCATGAAGG GGCAACCC 298 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CACTATAGGG AAAGCTTGCA TGCCTGCAGG TCGACTCTAG AGGATCTACT AGTCATATGG 60

ATTCCTAGGC GGCCAGATCT GATCAAGAGA CAGATCCAAC TCGGGCCGAT CGCCATAACG 120

CCAGCAGTTT GAAAGATGAA AGCCCAGCTT ATCCAGCCAT TCCGGTACAG CGTAACGAGC 180

AGGTTGCCAG AAATAACGAT AAAGTTGCAA CACCTCGGGA TCAGGTCGGC TCAAAAACGG 240

GGTCTCAGGC AAAAATAGCC GATCAGGATG CCCACTCCTA ATAACAGTCC TGTCAACG 298 

(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CCCCCCCCCT TCTCCTGGCT TACACAGCCC CAGACCGGCG CTGGAAAAGG CCATTCCCGC 60

CATACAGGAG GCCAGCAACA TATTTTCACG CGCCGCCAGA TCGTGGCCGT AACCCACGGC 120

TTTCGGCAGC GATTTGCCAA TCATCGCTAT CGCGCCAATC GCCAGGCTGT CGGTAAACGG 180

CGTGGCGTTG AGCGCGCTGT AGGCCTCAAT CGCATGCGTC AACGCATCGA TACCGGTCAT 240

CGCCGTCACG TTTGGCGGAA CGCCTTCGGT CACGGAAGCA TCAAGAATCG CCACGTCCGG 300

C 301

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CGCGAACGTG CGCCGCAACT GCTTGTGGAC GGTGAATTGC AGTTTGACGC CGCTTTCGTG 60

CCGGAGGTCG CCGCGCAAAA AGCGCCTGAC AGCCCGCTGC AAGGCCGCGC CAACGTGATG 120

ATTTTCCCGT CGCTGGAGGC GGGCAATATT GGCTACAAAA TCACTCAGCG TCTGGGAGGC 180

TATCGCGCTG TTGGGCCGCT AATTCAGGGG CTTGGCGCGC CGCTTCACGA CCTCTCCCGA 240

GGCTGTAGCG TGCAGGAAAT TATCGAACTG CGGTTGGTGA GAAAACCAA 289 

(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 303 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Partial sequence of Salmonella typhimurium
virulence gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CGCCCTAGCA TGCCTGGCGT TGTCCGGTTA TTGCTCGTCA AGCGAACAGA TGCAAAAGGT 60

GAGAGCGACT CTCGAATCAT GGGGGGTCAT GTATCGGGAT GGTGTAATCT GTGATGACTT 120

ATTGGTACGA GAAGTGCAGG ATGTTTTGGA TAAAAATGGG TTACCCGCAT GCTGAAGTAT 180

CCAGCGAAGG GCCGGGGAGC GTGTTAATTC ATGATGATAT ACAAATGGAT CAGCAATGGC 240

GCAAGGTTCA ACCATTACTT GCAGATATTC CCGGGTTATT GCACTGGCAG ATTAGTCACT 300

CTC 303



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CCCTTCCCAG GCTCGACAGG TACACAGCCA GCCACTGGTG CAGGCAGTTA CTTGCTTTCA 60

TCATGGGAAG GAGCAATATC CTGATATATT AAAGAAAGAG CGGGATCCCC TTTCTTTACT 120

GCTGCTAACG TTTCTTGCAA AATGCGTTGA TGAGATTCAT CCAGCACACC ACTGATAACA 180

AAAGAGCGCC GCATTGGCGT AACATTGACA AGCCCCACTA AACCGCTCTC TATTATCGCA 240

GAAATAATAT CATCCCCCTG AGACTGATGA GAGTGACTAT TCTGCCAGCG CAAATAACCC 300 

(2) INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

ATACCGAGTA TTAAGCGGCT GTGTAACATC GTCATCCAAC AACATACGCA GCGAGCCGCC 60

ACGCCGGAAA AACCGCATCG TGTCATGTGC CTGTTGTAGG GTCGGGTCTT TTTTCATGAG 120

TACGTTTTCT GCGCTATCAT ACTGGAAATT TCCCCCCACT TACTGATAAG CCCTGTCAGT 180

TGGGTAAGGA CAGAGTTAAG CTCCTGAGAC ATTTTTTGGA ATGGTTATCT TTCCCCGACT 240

CATAAAATCG GTATTCCCGC TGGGGGCAAT ATCCAAAGAC GCTTTGGTCG CCCGTAGGGC 300 

ACC 303 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GCCGTATGCC TGCAGTTGCC CGGTTATTGC TCGTCAAGCG AACCGATGCC AAAGGTGAGA 60 

GCGACTCTCG AATCATGGGG GGTCATGTAT CGGGATGGTG TAATCTGTGA TGACTTATTG 120

GTACGAGAAG TGCAGGATGT TTTGGTAAAA ATGGGTTACC CCCATGCTGA AGTATCCAGC 180

GAAGGGGCGG GGAGCGTGTT AATTCACGAT GATATTCAAA TGGGTCAGCA ATGGGGCAAG 240

GTTCAACCCC CACTTGCAGA TATTCCCCCC CCTATTGGAC TGGCAGATTA GTCACTCTCA 300

(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GGGCGACCTG CCCGCGGCGC AACTTTCCCC GAAGCGTTTT CCATTTCCTT GTTCTTAAAT 60

GACCTGGAAA GCTTACCTAA GCCTTGTCTT GCCTATGTGA CAATACTGCT TGGAGAACAC 120

CCGGACGTCC ATGATTATGC TATACAGATC ACAGCGGATG GGGGATGGTG AATCGGTTAT 180

TATACCACAA GTCGCAGCTC TGAGCTTATT GCTATTGAGA TAGAAAAACA CCCCGCTTCA 240

ACTTGGATTT TGAATAATGT AATACGCAAT CACCATACAC TATATTCGGG TGGCGTATAA 300

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 266 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

TTCGAGCTGG GGCACCGCTA ATATCTTTAA CCTCGCATCC CGGTGATGAA AGGATATTCT 60 

GGCTGCGTAA GTAATGAATG AACCGCCCAG CAGATAAAAT ATTGACAGTG ATAACCCGAT 120

GTTTTTTTAA CGATGCAGGC TATACATATA ACATAGCTGG CCACCAACAC AGCTGAAGTA 180

AATCATATTG TTGCTGCCAG GCTACTTCAC ACTATTGTCC GGCGGGCCAG CGGGGATTTT 240

CCCCCTAAAT CTCGCTGGTT CTCAAA 266 

(2) INFORMATION FOR SEQ ID NO: 30: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

AGTCTACGAT TTCGCTATAT CTTCTCTTAA TCATGGCCGC CATTTGTGGA TGCGATTTTA 60 

AAATATCCGG GCGATCTTTC ATTAAAAAAT AAAGATTCCC CATGACTTCA CAGATAAAGG 120 

TATCGGTATT TTGAGTGATA CGTAACAATT CGTTCTCTTC GTGTGGGTCC ATGATGCGAA 180 

GAATAATGGT GGCATCATTT TCATGAGGAT TATGAACCCG AAATCTTTCT CTTTGCGATG 240

CGCAGGCTAA CTCTTTCAAC TCAAAAAAAA TCTCTGTAAG CCGCTCTCGT GTGGGGGCGC 300 



(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium
virulence gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GCGCCCCTTT AATTGGTTGA GGCGGCTGGT ATTCTTGTAA GGGTAATACT AGCGAGACCC 60

AGGTTCCACC CCCGGGGACA CTTTTTAGTG TCAGATTACC GCCCATCATT TTAGCCAGGC 120

TTGACGCAAT AGTCAGTCCA ATTCCTGTAC CTTGCGAATT TGTGTCTGCT TGATAAAAAG 180

CAGAAAAGAT TTGAGACTGC TGCTGTTTTT CAATCCCCCC ACCGCTATCG CTAACCAGAA 240

ATATTAATTG TTCCTCACCA AGATTGAGCG CCAGACGTAT CCCTCCCCCC TCGGGAAAT 299 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

GGATAAGATC CCGGATAAGT ATGTCAGGCT CGTATGCACA ACAGGCATTA TAAACCTCTA 60

GACCATTTTT AACATGCTCT ACTATTTTAA AATGAGGCCA GGGTAATAAG GCATTCATAA 120

TGCCGTTAAT GATGATTTCA TGATCGTCTA CTAATAAGAT CTTATATTCT TTCATTTGGC 180

TGCCCTCGCG AAAATTAAGA TAATATTAAG TAATGGTGTA GGTTGTGGAG ATCATACGTA 240

TTTTCTGGCG TAAGTCGGTT AGTTCCTCCA GCGCGATGAT TTTCCCCATT TTTACGCGAT 300 

(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium
virulence gene

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TTCCATATTG CTCGTCCGGG GAGCGTGTTA ATTCTTGATG ATATACCAAT GGATCTGCAA 60 

TGGCGCAAGG TTCAACCATT ACTTGGAGAT ATTCCCGGGT TATTGTACTG GGAGATTAGT 120

CACTCTCATC AGTCTCAGGG GGGTGATGTT ATTTCTGGGA TAATAGAGCA ACGGCGTTAG 180

CAGGGGTCGG TCAGTAGTCA CGGCCAACTT CGGTGCACTT TTGCGTATCA CTGGGGTATC 240

ATAACTGAAT CTCATCCCCC CCACTTTGGT AATCACAC 278 



(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 301 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium
virulence gene
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

AATTCTTTTA CCTCCATAAG CTGCGTGGCA TAGCGATACA GAGTATTAAG CGGGTGTGTT 60

ACATCGTCAT CCAACAACAT ACGCAGCGAG CCGCCACGCC GGAAAAACCG CATCGTGTCA 120

TGTGCCTGTT GTAGGGTCGG GTCTTTTTTT CATGAGTACG TGTTCTGCGC TATCATACTG 180

GAAATTTCCC CCCACTTACT GATAAGCCCT GTCAGTTGGG TAAGGACAGC GTTAAGCTCC 240

TGAGACATTT TTTGAGTTGT TATCTGCCCC CCGACTCATA AGATCGGGTA TTCCGCGGTG 300 

G 301 



(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 297 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATATCCCTAA TGCTTTTCCT TAAAATAAAT ACCACGGAAG GATACTGGCC ACCTAGCCAA 60

ATTTAGAAAG CAATGAACAT CCGGTTTATT CCTGAAAACG ATTACTCCGG CGCACGTTGT 120

TCTGGCGTTA CCTGAGCCAG CAAACGATAT AATGGGGTGG TGACCCGCAT ACCGGTCATT 180

GGCATCCCAT CCACACCGGA GGGAGTAAAA CTCATTAGGC CATAGGTAAT ATCATTAAGA 240

CGCTCTAATA AATGAGGGTG GGGGGCCCAA AGTACCACTC CAGTATGTAT TGAGTCA 297 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 291 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Partial sequence of Salmonella typhimurium 
virulence gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

CCCATGGGCG CAATTTGTTG CGCAGCGTTT ACCCGACCAT CGCGTTTATG AGCTGTAATT 60

CATGGGGGGT AAAAACGGGC GTGACGACCC CAACGGAAGA TAAGGCCGGG CTTAAACAGG 120

AGATTATTGC TAATGCGCAG CGCAAAGTGT TGCTGGCGGA CAGCAGTAAG TATGGCGCGC 180

ATTCGCTCTT TAATGTGGTG CCGCTTGAGC GCTTTAATGA CGTGATTACC GACGTCAATC 240

TGCCGCCGTC AGCGCAGGTT GAACTGAAAG GGCGCGCTTT TTGCGCTAAC G 291 

(2) INFORMATION FOR SEQ ID NO: 37: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13417 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DNA sequence of VGC II from centre to left 
hand end 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CTGCAGAACC GAGCCAGGAG CAAATTAATT TTTTTGAACA ATTGCTGAAA GATGAAGCAT 60 

CCACCAGTAA CGCCAGTGCT TTATTACCGC AGGTTATGTT GACCAGACAA ATGGATTATA 120 

TGCAGTTAAC GGTAGGCGTC GATTATCTTG CCAGAATATC ACGGCGCAGC ATGCCAAGCG 180

CTTAATAAGC TGGATAACAT GGCATGAAGG TTCATCGTAT AGTATTTCTT ACTGTCCTTA 240 

CGTTCTTTCT TACGGCATGT GATGTGGATC TTTATCGCTC ATTGCCAGAA GATGAAGCGA 300

ATCAAATGCT GGCATTACTT ATGCAGCATC ATATTGATGC GAAAAAAAAC AGGAAGAGGA 360

TGGTGTAACC TTACGTGTCG AGCAGTCGGC AGTTTATTAA TGCGGTTGAG GCTACTTAGA 420

CTTAACGGTT ATCCGCATAG GGCAGTTTAC AACGGCGGAT AAGATGTTTC CGGCTAATCA 480

GTTAGTGGTA TCACCCCAGG AAGAACAGGC AGAAGATTAA TTTTTTAAAA GAACAAAGAA 540

TTGAAGGAAT GCTGAGTCAG ATGGAGGGGC GTGATTAATG GCAAAAGTGA CCATTGCGCT 600 

ACCGACTTAT GATGAGGGAA GTAACGCTTC TCCGAGCTCA GTTGCCGTAT TTATAAAATA 660

TTCACCTCAG GTCAATATGG AGGCCTTTCG GGTAAAAATT AAAGATTTAA TAGAGATGTC 720 

AATCCCTGGG TTGCAATACA GTAAGATTAG TATCTTGATG CAGCCTGCTG AATTCAGAAT 780

GGTAGCTGAC GTACCCGCGA GACAAACATT CTGGATTATG GACGTTATCA ACGCCAATAA 840

AGGGAAGGTG GTGAAGTGGT TGATGAAATA CCCTTATCCG TTGATGTTAT CGTTGACAGG 900

ACTGTTATTA GGAGTGGGCA TCCTGATCGG CTATTTTTGC CTGAGACGCC GTTTTTGAGC 960

CGACCTGATC CCGAGGTGTT GCAACTTTAT CGTTATTTCT GGCAACCTGC TCGTTACGCT 1020 

GTACCGGAAT GGCTGGATAA GCTGGGCTTT CATCTTCAAA CTGCTGGCGT TATGGCGATC 1080

GGCCCGAGTT GGATCGTCTT CTTGACAGAG CGTTAAATAG ACTAAGAGGA AGCTCTGTTA 1140

TTCCAGCCTG TTTAAATGAC AGGCAAAAAC GGCAGGTTCG TCTTGCGCCG CGTATATCGG 1200

CATTTGCCTT TGGGCTGGGA TTATTCAAAC TCAGGTGTAG TGACTATTTT ATGCTACCAG 1260

AGTATCGGCA ATTGCTTCTA CAGTGGTTTA GCGAGGATGA GATCTGGCAG CTATATGGTT 1320 

GGTTGGGGCA AAGAGATGGC AAATTACTTC CTCCGCAAGT GATGCAACAA ACTGCATTGC 1380

AGATCGGTAC CGCCATTCTT AATCGGGAAG CGCATGACGA TGCGGGTTTT ACATGCGCTA 1440

TTAGTATTAT TACCCCCTCC GCAGCGTATA CTTTGGCCGA AGACTTCTCT TACCGAGATT 1500 

ATCTTCATGG AGCATTTGCT ATGAGTTTTA CTTCACTTCC TCTGACGGAA ATTAACCATA 1560 

AGCTACCCGC TCGAAATATT ATTGAGTCAC AGTGGATAAC ATTACAATTA ACTTTATTTG 1620

CGCAAGAGCA ACAAGCTAAG AGAGTTTCAC ATGCTATTGT GAGCTCCGCT TACCGTAAGG 1680

CTGAAAAAAT CATCCGAGAC GCCTATCGTT ATCAGCGTGA ACAGAAAGTT GAGCAGCAAC 1740 

AAGAACTAGC GTGCTTGCGT AAAAATACGC TGGAAAAAAT GGAAGTGGAA TGGCTGGAAC 1800 

AGCATGTAAA ACATTTACAA GACGATGAAA ATCAATTTCG TTCATTGGTC GATCACGCAG 1860

CGCATCATAT TAAAAATAGT ATAGAACAGG TTCTGTTGGC CTGGTTCGAC CAACAGTCGG 1920 

TAGACAGTGT TATGTGCCAT CGTCTGGCAC GCCAGGCCAC GGCTATGGCG GAAGAGGGAG 1980

CGCTTTATTT GCGTATTCAT CCTGAAAAAG AGGCATTGAT GCGAGAAACT TTTGGCAAGC 2040

GGTTTACGTT GATTATCGAG CCTGGTTTCT CTCCCGATCA GGCTGAACTT TCCTCAACAC 2100 

GATATGCCGT TGAATTTTCA CTTTCTCGTC ATTTCAACGC GTTACTGAAA TGGTTACGTA 2160 

ATGGTGAAGA TAAAAGAGGT AGCGATGAAT ATTAAAATTA ATGAGATAAA AATGACGCCC 2220

CCTACAGCAT TTACCCCTGG CCAGGTTATA GAGGAACAAG AGGTTATTTC GCCTTCAATG 2280

TTAGCTCTCC AGGAGTTACA GGAAACGACG GGGGCAGCGC TCTATGAGAC GATGGAAGAA 2340 

ATAGGAATGG CGCTGAGTGG TAAACTGCGC GAAAATTATA AATTCACTGA TGCTGAGAAA 2400 

CTGGAGCGCA GACAGCAGGC TTTGCTGCGT TTGATAAAAC AAATACAGGA GGATAATGGG 2460

GCAACGTTGC GTCCGCTTAC CGAAGAGAAT AGTGATCCTG ATTTACAGAA TGCGTATCAA 2520 

ATTATCGCTC TTGCAATGGC GCTTACTGCC GGCGGGTTGT CAAAAAAGAA AAAACGCGAT 2580 

TTGCAATCGC AACTGGATAC GTTACAGCGG AGGAGGGATG GGAACTTGCC GTTTTTAGTT 2640

TACTGGAACT TGGCGAAGTG GATACCGTAC GCTGTCCTCT CTGAAGCGTT TTATGCAACA 2700 

GGCGATAGAC AACGATGAAA TGCCCTTATC GCAGTGGTTC AGACGCGTGG CAGACTGGCC 2760

GGATCGCTGT GAACGGGTCC GTATTTTGCT AAGAGCAGTA GCCTTTGAAC TTAGCATATG 2820

CATCGAACCC TCGGAGCAAA GTCGTTTGGC CGCAGCATTA GTACGTTTGC GTCGTTTGCT 2880

GTTATTCCTT GGCCTTGAAA AAGAGTGCCA GCGTGAGGAG TGGATTTGCC AGTTGCCGCC 2940

TAATACATTA CTGCCGCTAC TACTCGATAT TATTTGTGAG CGCTGGCTTT TCAGTGATTG 3000

GTTGCTTGAT AGACTTACCG CTATAGTTTC TTCATCGAAG ATGTTCAATC GGTTACTCCA 3060 

ACAACTTGAT GCGCAGTTTA TGCTGATACC CGATAACTGT TTTAACGACG AAGATCAACG 3120

TGAACAAATT CTCGAAACGC TTCGTGAAGT AAAGATAAAT CAGGTTTTAT TCTGATACCT 3180

GGCTTTCAAT ATTTAGGTAA ATTGGCTTTC TGGCTCATCA TGAGGCGTCA GGATGGATTG 3240

GGATCTCATT ACTGAACGTA ATATTCAGCT TTTTATTCAA TTAGCAGGAT TAGCTGAACG 3300 

GCCTTTAGCA ACCAATATGT TCTGGCGGCA AGGACAATAT GAAACTATCA TAACGGTCGT 3360

ATTCTCTTAT GTCAGATACT CAAGCAAACC TTCTTAGACG AAGAACTGCT TTTTAAAGCG 3420 

TTGGCTAACT GGAAACCCGC AGCGTTCCAG GGTATTCCTC AACGATTATT TTTGTTGCGC 3480

GATGGGCTTG CAATGAGTTG TTCTCCACCT CTTTCCAGCT CCGCCGAGCT CTGGTTACGA 3540

TTACATCATC GACAAATAAA ATTTCNTGGA GTCGCAATGC GTTCATGGTT AGGTGAGGGA 3600

GTCAGGGCGC AACAGTGGCT CAGTGTATGC GCGGGTCGGC AGGATATGGT TCTGGCGACG 3660 

GTGTTATTAA TCGCTATTGT GATGATGCTG TTACCCTTGC CGACCTGGAT GGTTGATATC 3720

CTGATTACTA TCAACCTTAT GTTTTCAGTG ATCCTGCTCT TAATTGCTAT TTATCTTAGT 3780

GACCCTCTCG ATTTATCGGT ATTTCCGTCT TTATTACTTA TTACTACATT ATATCGTTTG 3840

TCACTCACAA TCAGCACATC ACGGCTGGTA CTGTTACAAC ATAATGCCGG TAATATTGTG 3900

GATGCTTTCG GTAAGTTTGT CGTAGGAGGA AATCTCACCG TTGGGTTGGT CGTATTTACC 3960

ATCATTACTA TCGTGCAATT TATTGTCATT ACAAAAGGTA TCGAGAGGGT GGCGGAAGTT 4020

AGCGCACGTT TCTCGCTTGA TGGGATGCCA GGCAAACAAA TGAGTATCGA TGGCGATTTG 4080

CGTGCCGGAG TTATCGATGC AGACCATGCC CGTACATTAA GACAGCATGT CCAGCAGGAA 4140

AGCCGCTTTC TCGGTGCGAT GGACGGTGCG ATGAAATTTG TTAAAGGCGA TACGATTGCC 4200 

GGTATTATTG TTGTTCTGGT GAACATTATC GGCGGTATCA TTATCGCTAT CGTACAATAT 4260

GATATGTCGA TGAGTGAGGC TGTTCACACT TATAGCGTAC TGTCAATCGG AGATGGTTTA 4320 

TGTGGGCAAA TTCCATCGCT GCTGATTTCC CTTAGCGCGG GAATTATTGT CACCCGTGTC 4380

CCGGGTGAGA AACGCCAGAA CCTGGCGACA GAGTTGAGTT CTCAAATTGC CAGACAACCT 4440 

CAGTCGCTCA TATTAACCGC TGTGGTTTTA ATGCTCCTCG GTTTAATTCC TGGCTTTCCT 4500

TTTATCACTC TCGCTTTCTT TTCAGCGTTG TTAGCATTGC CAATTATCCT CATTCGCCGC 4560

AAAAAGTCTG TGGTTTCCGC AAATGGCGTC GAAGCACCGG AAAAAGATAG TATGGTTCCC 4620

GGCGCATGTC CTCTAATCTT ACGTCTTAGC CCGACGTTAC ATTCTGCCGA CCTGATTCGT 4680

GATATTGACG CCATGAGATG GTTTTTATTT GAGGATACCG GCGTCCCTCT CCCTGAGGTG 4740

AATATTGAGG TTTTGCCTGA ACCCACCGAA AAATTGACGG TACTGCTATA TCAGGAACCC 4800

GTATTTAGTT TATCTATTCC CGCTCAGGCG GATTATTTAT TGATAGGCGC GGACGCTAGT 4860

GTGGTGGGTG ACAGCCAGAC GTTACCGAAC GGGATGGGGC AGATCTGTTG GCTTACAAAA 4920

GACATGGCCC ATAAGGCGCA AGGTTTTGGA CTGGACGTTT TCGCGGGCAG CCAACGTATC 4980

TCTGCCTTAT TAAAATGTGT CCTGCTTCGG CATATGGGAG AGTTTATTGG TGTTCAGGAA 5040

ACGCGTTATC TAATGAATGC GATGGAAAAA AACTACTCTG AGCTGGTGAA AGAGCTTCAG 5100

CGCCAGTTAC CCATTAATAA AATCGCTGAA ACTTTGCAAC GGCTTGTATC AGAGCGGGTT 5160

TCTATTAGAG ATTTACGTCT TATTTTCGGC ACCTTAATTG ACTGGGCGCC ACGTGAAAAA 5220

GATGTCCTGA TGTTGACAGA ATATGTCCGT ATCGCGCTTC GTCGTCATAT TCTGCGTCGT 5280

CTTAATCCGG AAGGAAAACC GCTGCCGATT TTGCGGATCG GCGAAGGTAT TGAAAACCTC 5340

GTGCGTGAAT CCATTCGCCA GACGGCAATG GGGACCTATA CTGCGCTGTC GTCTCGTCAT 5400

AAGACGCAGA TCCTGCAACT TATCGAGCAG GCGCTGAAGC AGTCAGCCAA ATTATTCATT 5460

GTCACTTCTG TCGACACCCG ACGTTTCTTG CGAAAAATTA CAGAAGCCAC CTTGTTCGAC 5520 

GTACCGATTT TGTCATGGCA GGAATTAGGA GAGGAGAGCC TTATACAAGT GGTAGAAAGT 5580

ATTGACCTTA GCGAAGAGGA GTTGGCGGAC AATGAAGAAT GAATTGATGC AACGTCTGAG 5640

GCTGAAATAT CCGCCCCCCG ATGGTTATTG TCGATGGGGC CGAATTCAGG ATGTCAGCGC 5700 

AACGTTGTTA AATGCGTGGT TGCCTGGGGT ATTTATGGGC GAGTTGTGCT GTATAAAGCC 5760 

TGGAGAAGAA CTTGCTGAAG TCGTGGGGAT TAATGGCAGC AAAGCTTTGC TATCTCCTTT 5820

TACGAGTACA ATCGGGCTTC ACTGCGGGCA GCAAGTGATG GCCTTAAGCG ACGCCATCAG 5880 

GTTCCCGTGG GCGAAGCGTT ATTAGGGCGA GTTATTGATG GCTTTGGTCG TCCCCTTGAT 5940 

GGCCGCGAAC TGCCCGACGT CTGCTGGAAA GACTATGATG CAATGCCTCC TCCCGCAATG 6000

GTTCGACAGC CTATCACTCA ACCATTAATG ACGGGGATTC GCGCTATTGA TAGCGTTGCG 6060

ACCTGTGGCG AAGGGCAACG AGTGGGTATT TTTTCTGCTC CTGGCGTGGG GAAAAGCACG 6120

CTTCTGGCGA TGCTGTGTAA TGCGCCAGAC GCAGACAGCA ATGTTCTGGT GTTAATTGGT 6180

GAACGTGGAC GAGAAGTCCG CGAATTCATC GATTTTACAC TGTCTGAAGA GACCCGAAAA 6240

CGTTGTGTCA TTGTTGTCGC AACCTCTGAC AGACCCGCCT TAGAGCGCGT GAGGGCGCTG 6300

TTTGTGGCCA CCACGATAGC AGAATTTTTT CGCGATAATG GAAAGCGAGT CGTCTTGCTT 6360

GCCGACTCAC TGACGCGTTA TGCCAGGGCC GCACGGAAAT CGCTCTGGCG CCGGAGAGAC 6420

CGCGGTTTCT GGAGAATATC GCCAGGCGTA TTTAGTGCAT TGCCACGACT TTTAGAACGT 6480

ACGGGAATGG GAGAAAAAGG CAGTATTACC GCATTTTATA CGGTACTGGT GGAAGGCGAT 6540

GATATGAATG AAGCCGTTGG CGGATGAAGT CCGTTCACTG CTTGATGGAC ATATTGTACT 6600 

ATCCCGACGG CTTGCAGAGA GGGGGCATTA TCCTGCCATT GACGTGTTGG CAACGCTCAG 6660

CCGCGTTTTT CCAGTCGTTA CCAGCCATGA GCATCGTCAA CTGGCGGCGA TATTGCGACG 6720

GTGCCTGGCG CTTTACCAGG AGGTTGAACT GTTAATACGC ATTGGGGAAT ACCAGCGAGG 6780

AGTTGATACA GATACTGACA AAGCCATTGA TACCTATCCG GATATTTGCA CATTTTTGCG 6840

ACAAAGTAAG GATGAAGTAT GCGGACCCGA GCTACTTATA GAAAAATTAC ACCAAATACT 6900 

CACCGAGTGA TCATGGAAAC TTTGCTGGAG ATAATCGCGC GGCTGAAAAG CAATTACGCG 6960

GCAAGCTTAC CGTACTTGAT CAGCAGCAAC AGGCGATTAT TACGGAACAG CAGATTTGCC 7020

AGACGCGCGC TTTAGCAGTG TCTACCAGAC TGAAAGAATT AATGGGCTGG CAAGGTACGT 7080

TATCTTGTCA TTTATTGTTG GATAAGAAAC AACAAATGGC CGGGTTATTC ACTCAGGCGC 7140

AGAGCTTTTT GACGCAACGG CAAGCAGTTA GAGAATCAGT ATCAGCAGCT TGTCTCCCGG 7200

CGAAGCGAAT TACAGAAGAA TTTTAATGCG CTTATGAAAA AGAAAGAAAA AATTACTATG 7260

GTATTAAGCG ATGCGTATTA CCAAAGTTGA GGGAAGTCTT GGGTTGCCAT GCCAGTCTTA 7320

TCAGGATGAT AACGAGGCGG AGGCGGAACG TATGGACTTT GAACAACTCA TGCACCAGGC 7380

ATTACCCATT GGTGAGAATA ATCCTCCTGC AGCATTGAAT AAGAACGTGG TTTTCACGCA 7440

ACGTTATCGT GTTAGTGGCG GTTATCTTGA CGGTGTAGAG TGTGAAGTAT GTGAATCAGG 7500 

GGGGCTAATC CAGTTAAGAA TCAATGTCCC TCATCATGAA ATTTACCGTT CGATGAAAGC 7560 

GCTAAAGCAG TGGCTGGAGT CTCAGTTGCT GCATATGGGG TATATAATTT CCCTGGAGAT 7620

ATTCTATGTT AAGAATAGCG AATGAAGAGC GTCCGTGGGT GGAGATACTT CCAACGCAAG 7680

GCGCTACCAT TGGTGAGCTG ACATTGAGTA TGCAACAATA TCCAGTACAG CAAGGGACAT 7740

TATTTACCAT AAATTATCAT AATGAGCTGG GTAGGGTGTG GATTGCAGAA CAATGCTGGC 7800 

AGCGCTGGTG TGAAGGGCTA ATTGGCACCG CTAATCGATC GGCTATCGAT CCTGAATTGC 7860 

TATATGGAAT AGCTGAATGG GGGCTGGCGC CGTTATTGCA AGCCAGTGAT GCAACCCTCT 7920

GTCAGAACGA GCCGCCAACA TCCTGCAGTA ATCTACCACA TCAGCTAGCG TTGCATATTA 7980 

AATGGACAGT TGAAGAGCAT GAGTTCCATA GCATTATTTT TACATGGCCA ACGGGTTTTT 8040

TGCGCAATAT AGTCGGAGAG CTTTCTGCTG AGCGACAACA GATTTATCCT GCCCCTCCTG 8100

TGGTAGTCCC TGTATATTCA GGCTGGTGCC AGCTTACATT AATCGAACTT GAGTCTATCG 8160 

AAATCGGCAT GGGCGTTCGG ATTCATTGCT TCGGCGACAT CAGACTCGGT TTTTTTGCTA 8220

TTCAACTACC TGGGGGAATC TACGCAAGGG TGTTGCTGAC AGAGGATAAC ACGATGAAAT 8280

TTGACGAATT AGTCCAGGAT ATCGAAACGC TACTTGCGTC AGGGAGCCCA ATGTCAAAGA 8340

GTGACGGAAC GTCTTCAGTC GAACTTGAGC AGATACCACA ACAGGTGCTC TTTGAGGTCG 8400

GACGTGCGAG TCTGGAAATT GGACAATTAC GACAACTTAA AACGGGGGAC GTTTTGCCTG 8460

TAGGTGGATG TTTTGCGCCA GAGGTGACGA TAAGAGTAAA TGACCGTATT ATTGGGCAAG 8520

GTGAGTTGAT TGCCTGTGGC AATGAATTTA TGGTGCGTAT TACACGTTGG TATCTTTGCA 8580

AAAATACAGC GTAAACCTGA TAAGAAAAAT AATATGCGAA CAATATAATA GCGTTCCAGG 8640

TCGTGTCATG AGAGATACAG TATGTCTTTA CCCGATTCGC CTTTGCAACT GATTGGTATA 8700 

TTGTTTCTGC TTTCAATACT GCCTCTCATT ATCGTCATGG GAACTTCTTT CCTTAAACTG 8760

GCGGTGGTAT TTTCGATTTT ACGAAATGCT CTGGGTATTC AACAAGTCCC CCCAAATATC 8820

GCACTGTATG GCCTTGCGCT TGTACTTTCC TTATTCATTA TGGGGCCGAC GCTATTAGCT 8880

GTAAAAGAGC GCTGGCATCC GGTTCAGGTC GCTGGCGCTC CTTTCTGGAC GTCTGAGTGG 8940

GACAGTAAAG CATTAGCGCC TTATCGACAG TTTTTGCAAA AAAACTCTGA AGAGAAGGAA 9000 

GCCAATTATT TTCGGAATTT GATAAAACGA ACCTGGCCTG AAGACATAAA AAGAAAGATA 9060

AAACCTGATT CTTTGCTCAT ATTAATTCCG GCATTTACGG TGAGTCAGTT AACGCAGGCA 9120

TTTCGGATTG GATTACTTAT TTATCTTCCC TTTCTGGCTA TTGACCTGCT TATTTCAAAT 9180

ATACTGCTGG CTATGGGGAT GATGATGGTG TCGCCGATGA CCATTTCATT ACCGTTTAAG 9240 

CTGCTAATAT TTTTACTGGC AGGCGGTTGG GATCTGACAC TGGCGCAATT GGTACAGAGC 9300 

TTTTCATGAA TGATTCTGAA TTGACGCAAT TTGTAACGCA ACTTTTATGG ATCGTCCTTT 9360 

TTACGTCTAT GCCGGTAGTG TTGGTGGCAT CGGTAGTTGG TGTCATCGTA AGCCTTGTTC 9420 

AGGCCTTGAC TCAAATACAG GACCAAACGC TACAGTTCAT GATTAAATTA TTGGCAATTG 9480

CAATAACCTT AATGGTCAGC TACCCATGGC TTAGCGGTAT CCTGTTGAAT TATACCCGGC 9540

AGATAATGTT ACGAATTGGA GAGCATGGTT GAATGGCACA ACAGGTAAAT GAGTGGCTTA 9600 

TTGCATTGGC TGTGGCTTTT ATTCGACCAT TGAGCCTTTC TTTATTACTT CCCTTATTAA 9660 

AAAGTGGCAG TTTAGGGGCC GCACTTTTAC GTAATGGCGT GCTTATGTCA CTTACCTTTC 9720

CGATATTACC AATCATTTAC CAGCAGAAGA TTATGATGCA TATTGGTAAA GATTACAGTT 9780 

GGTTAGGGTT AGTCACTGGA GAGGTGATTA TTGGTTTTTC AATTGGGTTT TGTGCGGCGG 9840 

TTCCCTTTTG GGCCGTTGAT ATGGCGGGGT TTCTGCTTGA TACTTTACGT GGCGCGACAA 9900 

TGGGTACGAT ATTCAATTCT ACAATAGAAG CTGAAACCTC ACTTTTTGGC TTGCTTTTCA 9960

GCCAGTTCTT GTGTGTTATT TTCTTTATAA GCGGCGGCAT GGAGTTTATA TTAAACATTC 10020 

TGTATGAGTC ATATCAATAT TTACCACCAG GGCGTACTTT ATTATTTGAC CAGCAATTTT 10080 

TAAAATATAT CCAGGCAGAG TGGAGAACGC TTTATCAATT ATGTATCAGC TTCTCTCTTC 10140 

CTGCCATAAT ATGTATGGTA TTAGCCGATC TGGCTTTAGG TCTTTTAAAT CGGTCGGCAC 10200 

AACAATTGAA TGTGTTTTTC TTCTCAATGC CGCTCAAAAG TATATTGGTT CTACTGACGY 10260 

CCTGATCTCA TTCCCTTATG CTCTTCATCA CTATTTGGTT GAAAGCGATA AATTTTATAT 10320 

TTATCTAAAA GACTGGTTTC CATCTGTATG AGCGAGAAAA CAGAACAGCC TACAGAAAAG 10380 

AAATTACGTG ATGGCCGTAA GGAAGGGCAG GTTGTCAAAA GTATTGAAAT AACATCATTA 10440 

TTTCAGCTGA TTGCGCTTTA TTTGTATTTT CATTTCTTTA CTGAAAAGAT GATTTTGATA 10500 

CTGATTGAGT CAATAACTTT CACATTACAA TTAGTAAATA AACCATTTTC TTATGCATTA 10560

ACGCAATTGA GTCATGCTTT AATAGAGTCA CTGACTTCTG CACTGCTGTT TCTGGGCGCT 10620

GGGGTAATAG TTGCTACTGT GGGTAGCGTG TTTCTTCAGG TGGGGGTGGT TATTGCCAGC 10680 

AAGGCCATTG GTTTTAAAAG CGAGCATATA AATCCGGTAA GTAATTTTAA GCAGATATTC 10740 

TCTTTACATA GCGTAGTAGA ATTATGTAAA TCCAGCCTAA AAGTTATCAT GCTATCTCTT 10800 

ATCTTTGCCT TTTTCTTTTA TTATTATGCC AGTACTTTTC GGGCGCTACC GTACTGTGGG 10860 

TTAGCCTGTG GCGTGCTTGT GGTTTCTTCT TTAATAAAAT GGTTATGGGT AGGGGTGATG 10920 

GTTTTTTATA TCGTCGTTGG CATACTGGAC TATTCTTTTC AATATTATAA GATTAGAAAA 10980 

GCTATCTAAA AATGAGTAAA GATGACGTAA AACAGGAGCA TAAAGATCTG GAGGGCGACC 11040 

CTCAAATGAA GACGCGGCGT CGGAAATGCA GAGTGAAATA CAAAGTGGGA GTTTAGCTCA 11100 

ATCTGTTAAA CAATCTGTTG CGGTAGTGCG TAATCCAACG CATATTGCGG TTTGTCTTGG 11160

CTATCATCCC ACCGATATGC CAATACCACG CGTCCTGGAA AAAGGCAGTG ATGCTCAAGC 11220

TAACTATATT GTTAACATCG CTGAACGCAA CTGCATCCCC GTTGTTGAAA ATGTTGAGCT 11280 

GGCCCGCTCA TTATTTTTTG AAGTGGAACG CGGAGATAAA ATTCCTGAAA CGTTATTTGA 11340

ACCCGTTGCA GCCTTGTTAC GTATGGTGAT GAAGATAGAT TATGCGCATT CTACCGAAAC 11400 

ACCATAAATG CTTTTGGTAT GCTTCTTCAG GCCACTGCGA AGGTTAAGAG GGTAATAGCG 11460

TATAGAGCAG TGCTTGACGA TAAAGGTGAG AGACTGAAAA TAATCGCTTT TAGCCTGGCA 11520 

CAAGCACCAG ATAGCGTATT ATAAAATTAA ACAAGATAAT GGATTGGTGC GTCTGAATGG 11580 

ACTCGAACCA CTCGACCCCC ACCATGTCAA GGTGGTGCTC TAACCAACTG AGCTATGAAC 11640

GGCAACGTTG TAGGTGACAA CGGGGACGAA TATTAGCGTC ACAACCGCAA TGAGGCAAGA 11700

GGGAAATCGC AATTTTCTTC CTGAAATCAC CTGATTGCGG TGGAAATATG CAACATGTCG 11760 

AGAAAATAGC CGCCATGCGA CGGCTATCGT CGTATTATCG GAGCGCGCTG CAAAATGATG 11820 

GCGGACGGCT GACGTTGTAG ATAGCGCATC CGTAGCATCA TTAACACCGC CGCCGAGGTC 11880

AGGCCGATGA TGAACCCCAT CCAGAAGCCT GCCGGTCCCA TACGATCCAC CACCAAATCC 11940

GTTAACGCCA GGATATAACC GCTGGGTAAA CCTAACACCC AGTAGGCGGT AAAGGTGATA 12000 

AAAAAGATGG AACGCGTATC TTTATAACCG CGCAGAATAC CGCTGCCGAT AACCTGTATA 12060

GAGTCGGAAA TCTGGTAAAC CGCAGCGAGC AGCATTAATT GCGGCAAGCG CCACGACCTC 12120 

AGGGTTGTCA TTGTAGAGCA AAGCAATATG CTTACGCAGA GTAACGGTAA AAATAGCGGT 12180

AACCACAGCC ATACAAATGC CGACGCCTAA ACCGGTACGC GCTGCGTTTG CGCATCCAGC 12240

GTTGAGCCCT GGCCCAGACC GATAACCCAC TCGAATCGTT ACCGCCGCAG CCAGCGACAT 12300

CGGCAGTACG AACATCAGCG AGCTAAAGTT AAGCGCAATC TGATGACCGG CGACATCCAC 12360

AATACCTAAT GGCGAAACCA GCAGCGCAAC GACCGCAAAT AACGTCACTT CAAAGAACAG 12420

CCAGCGCAAT CGGCAACCCC AGTTGAATCA GGCGCTTCAT GACGACGCTA TCGGGTTTGC 12480

CAAAGCCTTT TTCATTACGA ATATCACGCA TTGAACGCGC GTGTTTAATG TAAGAAAGCA 12540

TGGCGATAAA CATCACCCAA TAGACCGCCG CAGTCGCAAC GCGGCAGCCG ATACCGCCGA 12600 

GTTCCGGCAT ACCAAAATGG CCATAGATAA AAATATAGTT CACCGGAATA TTCACCAGCA 12660

GGCCCAAAAA TCCCATCACC ATACCCGGTT TGGTTTTGGC CAGACCTTCG CACTGGTTTC 12720

GCGCTACCTG AAAGAAAAGG TATCCTGCGC CCCACAGCAG CGCGCGAAGA TAACCCACGG 12780

CTTTATCGGC CAGCGCCGGA TCAATATTAT GCATAGAGCG GATAATGTAT CCGGCATTCC 12840 

ACAGGACGAT CATCACCAGC ACGGAGACAA AGCCCGCCAG CCAGAACCCT TGTCGAACCT 12900

GATGCGCGAT ACGCTCACGA CGGCCGGAGC CATTGAGTTG CGCAATCACA GGCGTCAAGG 12960

CCAGCAGTAA GCCGTGACCA AACAAAATGG CGGGAAGCAG ATAGAGGTGC CGATAGCGAC 13020 

GGCAGCCATG TCCGTAGCGC TATAGCCTCC CGCCATGACG GTATCGACGA ATCCATTGCG 13080

GTCTATACCA CTTGCGCAAG GATCACCGGT ATCTGAACGC TAATAACTGA CGCGCTTCAC 13140 

TGGTATACTT CTGCACGTAT TCACCTTTTA TTTTGTTGTT ATATGAAAGA CTAAAAAGCC 13200 

GCCGAAGTGG CAGCCAAAAG AAATAGCAGG GGAAATTTCA GTCTATTGTA GCGGGGTATT 13260

ACTATTTCTC CAGTGAAAAA ACAGTTGTTA ACGGCGCATT GCTGGCAAGC TGTTTTTCCA 13320

CCTGCTATTG TGCTGAACAG TTCTGCTTTT ATTTATTTCA GGAGTTGAAG ATATGTTTAC 13380 

GGGGATCGTA CAGGGTACCG CGAAACTGGT ATCGATA 13417 

(2) INFORMATION FOR SEQ ID NO: 38: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5746 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DNA sequence of VGC II cluster C 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

GGATCCTTTT TCTTTAATGC TGCTAACGTT TCTTGCAAAA TGCGTTGATG AGATTCATCC 60 

AGTACACCAC TGATAACAAA AGAGCGNCGC ATTGGCNWAM MWTKRNNMRN NSCNNNACTA 120

AACCGTTCTC TATTATCGCA GAAATAATAT CATCCCCCTG AGACTGATGA GAGTGACTAA 180

TCTGCCAGTG CAATAACCCG GGAATATCTG CAAGTAATGG TTGAACCTTG CGCCATTGCT 240

GATCCATTTG TATATCATCA TGAATTAACA CGCTCCCCGG CCCTTCGCTG GATACTTCAG 300 

CATNSSGGTA ACCCATTTTT ATCAAAACAT CCTGCACTTC TCGTACCAAT AAGTCATCAC 360

AGATTACACC ATCCCGATAC ATGACCCCCC ATGATTCGAG AGTCGCTCTC ACCTTTTGCA 420

TCTGTTCGCT TGACGAGCAA TAACCGGACA ACTGCAGGCT GCCATCTTCT TTCCATTGCG 480 

CCCGCACATA ATGAATATTG CTTTTGTCTA ATAAAAACTT AACCCGCAAA GGTAAGTCAT 540

TTACCGTTTC AGGCTGACCA CTAATACTTA ACAGGACACC CATTCCACCG ATGAAAATCA 600 

AGAATACGCC AGCCAACCAC CAGTACCCTG ATCTGGAAAC GGGTATTTGA TAATCAGCAA 660

GTTCACAATC CTGTTTACCA AACGCGATAS SCACTCCCGC AACCTGCAAA ACCCCACTGG 720 

ATGGTAGCGG CTTATTTGGA TTAAATCTGC GGCCATTAAC TCTAACTCTG GCTTTCCCGG 780

CATCAACAAA TAAACTATCT GCCTGTTCTC TCAGAATAAT TTTTTCATTT ATAGCCAGCG 840

AATACAAATA TCGCATCCCT TCTCCCCCAG TGACAGGTTA CCTTCATTCA GCCATACTTC 900 

CCGGCCTTGT AAAACGTGAC CTAAAAAACG TATTTTCCAG GAACTCTTTG GATTAACCAT 960

GAGATATGCC ATTATTTACT ACTGAGGCTT TAATCAAAAA AAGCCTGATT ACACTATGTA 1020

CTTGAGTCGT ATCATTGCGA AACAAATGAC CTACAACAGG AATATCGCCC AATAAAGGGA 1080

TTTTGTTTTG CGAGTGGATT TGTTTACCTT GTTTAAACCC TCCCAGCAAT NAGACTTTGC 1140

CCGGCCAATA ATGTGGCTTG CGAANCRATT TCAGAATTTT GCACTTCGGG CAGCGGGTCT 1200 

GTNTYGCYTT KGNSTATCAC TTTGTTGTCC ATCCTGAANT ATTAAGATTA AGCATTATTT 1260

TTTGCGTGCC ATTGTCATTT AACAAGCGAG GTGTAACGCG WNAACAAAGA ACCCGTAGTG 1320

ATGGATTCAA GTTTAGCCAC TTTTTCTCCC TGCAGTTTGG TATAGAAAGT AATATTTTTA 1380

TCCAGCACAG CCTGGATATT ATTTAAAGTC ACCACAGATG GCTGGGAAAG TACATAAGCC 1440

TGAGAGCTTT TTTCCAGGGC ATTCAGACGC ACCATAAAGT TTGAGGTATC GCTGATTACC 1500

GTTGANNAAC CACTAGCACC ACCGTCATTC AAACCTGTAT TGAACGCAAT TTTCTTGCCA 1560

CCCAGCGACA CTGCCGTTCC CCAGTCGATG CCTAACTGGT TAATATCTCC AGCATTAACA 1620

TCGATAATTT TCACCGAAAT CTCTATCATC TGCTGGCGTT GATCTAATTC TGTGATGAGT 1680 

TTCCGATACN NNGCCATATT GGNNNCATAA TCACGAACGA TCACTGCATT CTGGCGTNGG 1740

GTCGGCAGCA AACATNGGCA ATGCCTGTGT AGCGGGTGAA CCATTGTTCN TCGATGACGT 1800

CGGGACGCTG GTTTTACTCA TCTCACGCAA TACACTAACG ACCCCTGGNN AACCACGACG 1860

GACTGATCGC GATATTGGTA CTGGGTATCC ATCGCAGTGG CATACTTAAG CGTGTATATA 1920

CTTACACTCA CCGCACTGTC TTTTCGTTTG ATTAACGCAT TATCCAGCAC TGAAGCTAAT 1980 

TGACTAATAC GAGTCAGGCA GCTGGGAACA CCGCTCACCT CCACAGCTTT GGTACCGGTA 2040

ATTTCT1TAA CCTCGCATCC CGGTGATGAA AGGATATTCT GGCTGCGTAA GTAATGAATG 2100

AACCGTCCAG TAGATAAAAT ATTGAAAGTG ATAACCTGAT GTTTTAATAA CGATGCAGGA 2160

TATACATATA ACATGCTGCC ATCAAACCAG GTAAGCAAAT CATATTGTGC TGCCAGGTTA 2220 

TTCAAAATAT CGACCGGTGG TCCAGGCGGA ATTTTTCCAC TAAATGTAGC TGTTATCAAT 2280

GGGCTAATAG TAATAGCCGT ATCATAGTTC TCTGAGAGCA GATGTNAAAA CCTCTGCTAA 2340 

TGGCATTTGT CTGGCATAAA GGGTGAAGTC ATTACCTTTC CATGATAACT CATCACTCTT 2400 

TGCTGTATTG AGTATAAATA GTAAAATTAA GATTAAACGT TTATTTACTA CCATTTTATA 2460 

CCCCACCCGA ATAAAGTTTA TGGTGATTGC GTATTACATT TTTTNAAAAT GCAAGTTAAA 2520

GCCAGGTGTT TTTCTATCTC AATAGCAATA AGCTCAGAGC TACTACTTGT GGTATAATAA 2580

CCGTTTAACC ATCCCCCATC CGCTGTGAGC TGTATAGCAT AATCATGGAC GTCCGGGTGT 2640 

GCGCAARCRG TAGTGTCAMM TAGGCAAGAC AAGGCTTAGG TAAGCTTTCC AGGTCATTTA 2700 

AGAACAAAGA AATAGAAAAT GCTTCTGAGA AAATTTCTYC YBHNNNNNNN NHNNNNNNNN 2760 

NNNNNNNNCA TCAATAGTCA TTATCCAGGA TSSKMTWWYM NYYKSSSCYS WKATMYYSWR 2820

WWTTAATGGA ATGCCTTTTA AAACTGCCAG CATGAATCCC TCCTCAGACA TAAATGGGAG 2880

TTTCTATCAA ATTCGCTCAC AACCACATCC GTAAAAAGCC TGATTCACAT TTATTTCGAC 2940

TATACTCTTC TTGTACAATA TCAGGATGCT GTCTACATAT ACCTTGTCAC AGGCGATTCT 3000 

ATCATTCGGA TTTTCCGATA AATTNMMCAA TTACATTTTC AGCATTGACA TAAAAACTTA 3060

CAATTTGNAA AATTATTTAT TAAATAAACT GTTACGATGT TTTTACATCG CCATCTTATT 3120

AAAAAGTAAT TGTAGTCATC GACTNGGTTA TATATGAAGA AATTTATCTT CCTAATGATA 3180

ACACCATCGA TTAATCWWCT GATGAAACTA TATGTACTGC GATAGTGATC AAGTGCCAAA 3240

GATTTTGCAA CAGGCAACTG GAGGGAAGCA TTATGAATTT SSTCAATCTC AAGAATACSS 3300

YSYRNNNNNN TCTTTAGTAA TCAGGCTAAC TTTTTTATTT TTATTAACAA CAATAATTWT 3360

TTGGCTGCTA TCTGTGCTTA CCGCAGCTTA TATATCAATG GTTCRGAAAC GGCAGCATAT 3420

AATAGAGGAT TTATCCGTTC TATCCGAGAT GAATATTGTA CTAAGCAATC AACGGTTTGA 3480

AGAAGCTGAA CGTGACGCTA AAAATTTAAT GTATCAATGC TCATTAGCGA CTGAGATTCA 3540

TCATAACGAT ATTTTCCCTG AGGTGAGCCG GCATCTATCT GTCGGTCCTT CAAATTGCAC 3600

MGCCGACGCT NAACGGAGAG AAGCACCGTC TCTTTCTGCA GTCCTCTGAT ATCGATGAAA 3660 

ATAGCTTTCG TCGCGATAGT TTTATTCTTA ATCATAAAAA TGAGATTTCG TTATTATCTA 3720 

CTGATAACCC TTCAGATTAT TCAACTCTAC AGCCTTTAAC GCGAAAAAGC TTTCCTTTAT 3780

ACCCAACCCA TGCCGGGTTT TACTGGAGTG AACCAGAATA CATAAACGGC AAAGGATGGC 3840 

AACGCTTCCG TTGCGGTTGC CGATCAGGCA AGGCGTATTT TTTGAGGTGA CGGTTAAACT 3900

TCCCGATCTC ATTACTAAGA GCCACCTGCC ATTAGATGAT AGTATTCGAG TATGGCTGGA 3960

TCAAAACAAC CACTTATTGC CGTTTTCATA CATCCCGGCA AAAAATACGT ACACAGTTAG 4020

AAAATGTAAC GCTGCATGAT GGATGGCAGC AAATTCCCGG ATTTCTGATA TTACGCACAA 4080 

CCTTGCATGG CCCCGGATGG AGTCTGGTTA CGCTGTACCC ATACGGTAAT CTACATAATC 4140 

GCATCTTAAA AATTATCCTT CAACAAATCC CCTTTACATT AACAGCATTG GTGTTGATGA 4200

CGTCGGCTTT TTGCTGGTTA CTACATCGCT CACTGGCCAA ACCGTTATGG CGTTTTGTCG 4260 

ATGTCATTAA TAAAACCGCA ACTGCACCGC TGAGCACACG TTTACCAGCA CAACGACTGG 4320

ATGAATTAGA TAGTATTGCC GGTGCTTTTA ACCAACTGCT TGATACTCTA CAAGTCCAAT 4380

ACGACAATCT GGAAAACAAA GTCGCAGACG CACCCAGGCG CTAAATGAAG CAAAAAAACG 4440

CGCTGAGCNA GCTAACAAAC GTAAAAGCAT TCATCTTACG GTAATAAGTC ATGAGTTACG 4500 

TACTCCGATG AATGGCGTAC TCGGTGCAAT TGAATTATTA CAAACCACCC CTTTAAACAT 4560

AGAGCAACAA GGATTAGCTG ATACCGCCAG AAATTGTACA CTGTCTTTGT TAGCTATTAT 4620 

TAATAATCTG CTGGATTTTT CACGCATCGA GTCTGGTCAT TTCACATTAC ATATGGAAGA 4680

AACAGCGTTA CTGCCGTTAC TGGACCAGGC AATGCAAACC ATCCAGGGGC CAGCGCNAAA 4740

GCAAAAAACT GTCATTACGT ACTTTTGTCG GTCAACATGT CCCTCTCTAT TTTCATACCG 4800

ACAGTATCCG TTTACNNCAA ATTTTGGTTA ATTTACTCGG GAACGCGGTA AAATTTACCG 4860

AAACCGGAGG ATACGTCTGA CGGTCAAGCG TCATGAGGAA CAATTAATAT TTCTGGTTAG 4920

CGATAGCGGT AAAGGGATTG AAATACAGCA GCAGTCTCAA ATCTTTACTG CTTTTTATCA 4980

AGCAGACACA AATTCGCAAG GTACAGGAAT TGGACTGACT ATTGCGTCAA GCCTGGCTAA 5040

AATGATGGGC GGTAATCTGA CACTAAAAAG TGTCCCCGGG GTTGGAACCT GTGTCTCGCT 5100 

AGTATTACCC TTACAAGAAT ACCAGCCGCC TCAACCAATT AAAGGGACGC TGTCAGNNNC 5160

CGTTCTGCCT GCATCGGCAA CTGGCTTGCT GGGGAATACG CGGTGAACCA CCCCACCAGC 5220

AAAATGCGCT TCTCAANNCN AGAGCTTTTG TATTTCTCCG GAAAACTCTA CGACCTGGCG 5280

CAACAGTTAA TATTGTGTAC ACCAAATATG CCAGTAATAA ATAATTTGTT ACCACCCTGG 5340

CAGTTGCAGA TTCTTTTGGT TGATGATGCC GATATTAATC GGGATATCAT CGGCAAAATG 5400

CTTGTCAGCC TGGGCCAACA CGTCACTATT GCCGCCAGTA GTAACGAGGC TCTGACTTTA 5460

TCACAACAGC AGCGATTCGA TTTAGTACTG ATTGACATTA GAATGCCAGA AATAGATGGT 5520

ATTGAATGTG TACGATTATG GCATGATGAG CCGAATAATT TAGATCCTGA CTGCATGTTT 5580

GTGGCACTAT CCGCTAGCGT ASCVNMAGAW RWTMWTCRTY GTDDAAAAAA WRDGRKDHWT 5640

CATHAYANNT TACAAAACCA GTGACATTGG CTACCTTAGC TCGCTACATC AGTATTGCCG 5700 

CAGAATACCA ACTTTTACGA AATATAGAGC TACAGGAGCA GGATCC 5746 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCACCAGCCG CTGGGGTACC AGGGCCAGGC GACGGATATT GAAATTCACG CCCGCGAAAT 60

TTTGAAAGTA AAAGGGCGCA TGAATGAACT TATGRMKYKM MATACGGGTC ANTCTCTTGA 120

GCAGATTGAA SGTGATACTG A 141 

(2) INFORMATION FOR SEQ ID NO: 40: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

TGAAGCGGTA GAGTACGGTT TGGTTGACTC AATTTTGACC CATCGTAATT GATGCCCTGG 60

ACGCAA 66 

(2) INFORMATION FOR SEQ ID NO: 41: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

CCAACCGTTG GGCGGCTACC AGGGCCAGGC GACCGATATC GAAATTCATG CCCGTGAAAT 60

TCTGAAAGTT AAAGGGCGCA TGAATGAACT TATGGCGCTT CATACGGGTC AATCATTAGA 120 

ACAGATTGAA CGTGATACCG A 141 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGAAGCGGTG GAATACGGTC TGGTCGATTC GATTCTGACC CATCGTAATT GATGCCAGAG 60

GCGCAA 66 

(2) INFORMATION FOR SEQ ID NO: 43: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

GATATCGAAA TTCATGCCCG TGAAATTCTG AAAGTTAAAG GGCGCATGAA TGAACTTATG 60 

GCGCTTCATA CGGGTCAATC ATTAGAACAG ATTGAACGTG ATACCGA 107 

(2) INFORMATION FOR SEQ ID NO: 44: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



TGAAGCGGTG GAATACGGTC TGGTCGATTC GATTCTGACC CATCGTAATT GATGCCAGAG 60

GCGCAA 66



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Leu Gln Asn Arg Ala Arg Ser Lys Leu Ile Phe Leu Asn Asn Cys
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Lys Met Lys His Pro Pro Val Thr Pro Val Leu Tyr Tyr Arg Arg Leu 
1 5 10 15

Cys 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Pro Asp Lys Trp Ile Ile Cys Ser
1 5 

(2) INFORMATION FOR SEQ ID NO: 48: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 



Ala Ser Ile Ile Leu Pro Glu Tyr His Gly Ala Ala Cys Gln Ala Leu
1 5 10 15

Asn Lys Leu Asp Asn Met Ala 
20 

(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
Arg Phe Ile Val



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 79 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Tyr Phe Leu Leu Ser Leu Arg Ser Phe Leu Arg His Val Met Trp Ile
1 5 10 15

Phe Ile Ala His Cys Gln Lys Met Lys Arg Ile Lys Cys Trp His Tyr
20 25 30

Leu Cys Ser Ile Ile Leu Met Arg Lys Lys Thr Gly Arg Gly Trp Cys
35 40 45

Asn Leu Thr Cys Arg Ala Val Gly Ser Leu Leu Met Arg Leu Arg Leu 
50 55 60 

Leu Arg Leu Asn Gly Tyr Pro His Arg Ala Val Tyr Asn Gly Gly 
65 70 75 

(2) INFORMATION FOR SEQ ID NO: 51: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



Asp Val Ser Gly 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: 

Ser Val Ser Gly Ile Thr Pro Gly Arg Thr Gly Arg Arg Leu Ile Phe
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Lys Asn Lys Glu Leu Lys Glu Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO:54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Val Arg Trp Arg Gly Val Ile Asn Gly Lys Ser Asp His Cys Ala Thr
1 5 10 15 

Asp Leu 

(2) INFORMATION FOR SEQ ID NO: 55: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



Arg Phe Ser Glu Leu Ser Cys Arg Ile Tyr Lys Ile Phe Thr Ser Gly
1 5 10 15

Gln Tyr Gly Gly Leu Ser Gly Lys Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

Arg Phe Asn Arg Asp Val Asn Pro Trp Val Ala Ile Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Tyr Leu Asp Ala Ala Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 58: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

Ile Gln Asn Gly Ser
1 5 

(2) INFORMATION FOR SEQ ID NO: 59: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Arg Thr Arg Glu Thr Asn Ile Leu Asp Tyr Gly Arg Tyr Gln Arg Gln
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Arg Glu Gly Gly Glu Val Val Asp Glu Ile Pro Leu Ser Val Asp Val
1 5 10 15

Ile Val Asp Arg Thr Val Ile Arg Ser Gly His Pro Asp Arg Leu Phe
20 25 30

Leu Pro Glu Thr Pro Phe Leu Ser Arg Pro Asp Pro Glu Val Leu Gln
35 40 45

Leu Tyr Arg Tyr Phe Trp Gln Pro Ala Arg Tyr Ala Val Pro Glu Trp
50 55 60

Leu Asp Lys Leu Gly Phe His Leu Gln Thr Ala Gly Val Met Ala Ile
65 70 75 80

Gly Pro Ser Trp Ile Val Phe Leu Thr Glu Arg
85 90 

(2) INFORMATION FOR SEQ ID NO: 61: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Glu Glu Ala Leu Leu Phe Gln Pro Val
1 5 

(2) INFORMATION FOR SEQ ID NO: 62: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Met Thr Gly Lys Asn Gly Arg Phe Val Leu Arg Arg Val Tyr Arg His 
1 5 10 15

Leu Pro Leu Gly Trp Asp Tyr Ser Asn Ser Gly Val Val Thr Ile Leu
20 25 30

Cys Tyr Gln Ser Ile Gly Asn Cys Phe Tyr Ser Gly Leu Ala Arg Met
35 40 45 

Arg Ser Gly Ser Tyr Met Val Gly Trp Gly Lys Glu Met Ala Asn Tyr 
50 55 60 

Phe Leu Arg Lys 
65 

(2) INFORMATION FOR SEQ ID NO:63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Cys Asn Lys Leu His Cys Arg Ser Val Pro Pro Phe Leu Ile Gly Lys
1 5 10 15

Arg Met Thr Met Arg Val Leu His Ala Leu Leu Val Leu Leu Pro Pro
20 25 30

Pro Gln Arg Ile Leu Trp Pro Lys Thr Ser Leu Thr Glu Ile Ile Phe
35 40 45 

Met Glu His Leu Leu 
50 

(2) INFORMATION FOR SEQ ID NO: 64: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: 

Val Leu Leu His Phe Leu 
1 5

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Arg Lys Leu Thr Ile Ser Tyr Pro Leu Glu Ile Leu Leu Ser His Ser
1 5 10 15

Gly 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Leu Tyr Leu Arg Lys Ser Asn Lys Leu Arg Glu Phe His Met Leu Leu 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Ala Pro Leu Thr Val Arg Leu Lys Lys Ser Ser Glu Thr Pro Ile Val
1 5 10 15

Ile Ser Val Asn Arg Lys Leu Ser Ser Asn Lys Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 68: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Arg Ala Cys Val Lys Ile Arg Trp Lys Lys Trp Lys Trp Asn Gly Trp
1 5 10 15



Asn Ser Met 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Asn Ile Tyr Lys Thr Met Lys Ile Asn Phe Val His Trp Ser Ile Thr
1 5 10 15

Gln Arg Ile Ile Leu Lys Ile Val
20 

(2) INFORMATION FOR SEQ ID NO: 70:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Asn Arg Phe Cys Trp Pro Gly Ser Thr Asn Ser Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 71: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Thr Val Leu Cys Ala Ile Val Trp His Ala Arg Pro Arg Leu Trp Arg
1 5 10 15

Lys Arg Glu Arg Phe Ile Cys Val Phe Ile Leu Lys Lys Arg His
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 72: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Cys Glu Lys Leu Leu Ala Ser Gly Leu Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 

Leu Ser Ser Leu Val Ser Leu Pro Ile Arg Leu Asn Phe Pro Gln His
1 5 10 15

Asp Met Pro Leu Asn Phe His Phe Leu Val Ile Ser Thr Arg Tyr
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 74: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 189 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Asn Gly Tyr Val Met Val Lys Ile Lys Glu Val Ala Met Asn Ile Lys
1 5 10 15

Ile Asn Glu Ile Lys Met Thr Pro Pro Thr Ala Phe Thr Pro Gly Gln
20 25 30

Val Ile Glu Glu Gln Glu Val Ile Ser Pro Ser Met Leu Ala Leu Gln
35 40 45

Glu Leu Gln Glu Thr Thr Gly Ala Ala Leu Tyr Glu Thr Met Glu Glu
50 55 60

Ile Gly Met Ala Leu Ser Gly Lys Leu Arg Glu Asn Tyr Lys Phe Thr
65 70 75 80

Asp Ala Glu Lys Leu Glu Arg Arg Gln Gln Ala Leu Leu Arg Leu Ile
85 90 95

Lys Gln Ile Gln Glu Asp Asn Gly Ala Thr Leu Arg Pro Leu Thr Glu
100 105 110

Glu Asn Ser Asp Pro Asp Leu Gln Asn Ala Tyr Gln Ile Ile Ala Leu
115 120 125

Ala Met Ala Leu Thr Ala Gly Gly Leu Ser Lys Lys Lys Lys Arg Asp
130 135 140

Leu Gln Ser Gln Leu Asp Thr Leu Gln Arg Arg Arg Asp Gly Asn Leu
145 150 155 160

Pro Phe Leu Val Tyr Trp Asn Leu Ala Lys Trp Ile Pro Tyr Ala Val
165 170 175

Leu Ser Glu Ala Phe Tyr Ala Thr Gly Asp Arg Gln Arg
180 185 

(2) INFORMATION FOR SEQ ID NO: 75: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: 

Asn Ala Leu Ile Ala Val Val Gln Thr Arg Gly Arg Leu Ala Gly Ser
1 5 10 15

Leu 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Thr Gly Pro Tyr Phe Ala Lys Ser Ser Ser Leu 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 77: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:



His Met His Arg Thr Leu Gly Ala Lys Ser Phe Gly Arg Ser Ile Ser
1 5 10 15

Thr Phe Ala Ser Phe Ala Val Ile Pro Trp Pro
20 25 

(2) INFORMATION FOR SEQ ID NO:78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Lys Arg Val Pro Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Gly Val Asp Leu Pro Val Ala Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Tyr Ile Thr Ala Ala Thr Thr Arg Tyr Tyr Leu
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 81: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Ala Leu Ala Phe Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: 

Thr Tyr Arg Tyr Ser Phe Phe Ile Glu Asp Val Gln Ser Val Thr Pro
1 5 10 15

Thr Thr 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Cys Ala Val Tyr Ala Asp Thr Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 84: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 

Arg Arg Arg Ser Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 85: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Thr Asn Ser Arg Asn Ala Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Ser Lys Asp Lys Ser Gly Phe Ile Leu Ile Pro Gly Phe Gln Tyr Leu
1 5 10 15

Gly Lys Leu Ala Phe Trp Leu Ile Met Arg Arg Gln Asp Gly Leu Gly
20 25 30 

Ser His Tyr 
35 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Tyr Ser Ala Phe Tyr Ser Ile Ser Arg Ile Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 88: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Thr Ala Phe Ser Asn Gln Tyr Val Leu Ala Ala Arg Thr Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 759 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Asn Tyr His Asn Gly Arg Ile Leu Leu Cys Gln Ile Leu Lys Gln Thr
1 5 10 15

Phe Leu Asp Glu Glu Leu Leu Phe Lys Ala Leu Ala Asn Trp Lys Pro
20 25 30

Ala Ala Phe Gln Gly Ile Pro Gln Arg Leu Phe Leu Leu Arg Asp Gly
35 40 45

Leu Ala Met Ser Cys Ser Pro Pro Leu Ser Ser Ser Ala Glu Leu Trp
50 55 60

Leu Arg Leu His His Arg Gln Ile Lys Phe Xaa Gly Val Ala Met Arg
65 70 75 80

Ser Trp Leu Gly Glu Gly Val Arg Ala Gln Gln Trp Leu Ser Val Cys
85 90 95

Ala Gly Arg Gln Asp Met Val Leu Ala Thr Val Leu Leu Ile Ala Ile
100 105 110

Val Met Met Leu Leu Pro Leu Pro Thr Trp Met Val Asp Ile Leu Ile
115 120 125

Thr Ile Asn Leu Met Phe Ser Val Ile Leu Leu Leu Ile Ala Ile Tyr
130 135 140

Leu Ser Asp Pro Leu Asp Leu Ser Val Phe Pro Ser Leu Leu Leu Ile
145 150 155 160

Thr Thr Leu Tyr Arg Leu Ser Leu Thr Ile Ser Thr Ser Arg Leu Val
165 170 175

Leu Leu Gln His Asn Ala Gly Asn Ile Val Asp Ala Phe Gly Lys Phe
180 185 190

Val Val Gly Gly Asn Leu Thr Val Gly Leu Val Val Phe Thr Ile Ile
195 200 205

Thr Ile Val Gln Phe Ile Val Ile Thr Lys Gly Ile Glu Arg Val Ala
210 215 220

Glu Val Ser Ala Arg Phe Ser Leu Asp Gly Met Pro Gly Lys Gln Met
225 230 235 240 



Ser Ile Asp Gly Asp Leu Arg Ala Gly Val Ile Asp Ala Asp His Ala
245 250 255

Arg Thr Leu Arg Gln His Val Gln Gln Glu Ser Arg Phe Leu Gly Ala
260 265 270

Met Asp Gly Ala Met Lys Phe Val Lys Gly Asp Thr Ile Ala Gly Ile
275 280 285

Ile Val Val Leu Val Asn Ile Ile Gly Gly Ile Ile Ile Ala Ile Val
290 295 300

Gln Tyr Asp Met Ser Met Ser Glu Ala Val His Thr Tyr Ser Val Leu
305 310 315 320

Ser Ile Gly Asp Gly Leu Cys Gly Gln Ile Pro Ser Leu Leu Ile Ser
325 330 335

Leu Ser Ala Gly Ile Ile Val Thr Arg Val Pro Gly Glu Lys Arg Gln
340 345 350

Asn Leu Ala Thr Glu Leu Ser Ser Gln Ile Ala Arg Gln Pro Gln Ser
355 360 365

Leu Ile Leu Thr Ala Val Val Leu Met Leu Leu Ala Leu Ile Pro Gly
370 375 380

Phe Pro Phe Ile Thr Leu Ala Phe Phe Ser Ala Leu Leu Ala Leu Pro
385 390 395 400

Ile Ile Leu Ile Arg Arg Lys Lys Ser Val Val Ser Ala Asn Gly Val
405 410 415

Glu Ala Pro Glu Lys Asp Ser Met Val Pro Gly Ala Cys Pro Leu Ile
420 425 430

Leu Arg Leu Ser Pro Thr Leu His Ser Ala Asp Leu Ile Arg Asp Ile
435 440 445

Asp Ala Met Arg Trp Phe Leu Phe Glu Asp Thr Gly Val Pro Leu Pro
450 455 460

Glu Val Asn Ile Glu Val Leu Pro Glu Pro Thr Glu Lys Leu Thr Val
465 470 475 480

Leu Leu Tyr Gln Glu Pro Val Phe Ser Leu Ser Ile Pro Ala Gln Ala
485 490 495

Asp Tyr Leu Leu Ile Gly Ala Asp Ala Ser Val Val Gly Asp Ser Gln
500 505 510

Thr Leu Pro Asn Gly Met Gly Gln Ile Cys Trp Leu Thr Lys Asp Met
515 520 525

Ala His Lys Ala Gln Gly Phe Gly Leu Asp Val Phe Ala Gly Ser Gln
530 535 540

Arg Ile Ser Ala Leu Leu Lys Cys Val Leu Leu Arg His Met Gly Glu
545 550 555 560

Phe Ile Gly Val Gln Glu Thr Arg Tyr Leu Met Asn Ala Met Glu Lys
565 570 575

Asn Tyr Ser Glu Leu Val Lys Glu Leu Gln Arg Gln Leu Pro Ile Asn
580 585 590

Lys Ile Ala Glu Thr Leu Gln Arg Leu Val Ser Glu Arg Val Ser Ile
595 600 605

Arg Asp Leu Arg Leu Ile Phe Gly Thr Leu Ile Asp Trp Ala Pro Arg
610 615 620

Glu Lys Asp Val Leu Met Leu Thr Glu Tyr Val Arg Ile Ala Leu Arg
625 630 635 640

Arg His Ile Leu Arg Arg Leu Asn Pro Glu Gly Lys Pro Leu Pro Ile
645 650 655

Leu Arg Ile Gly Glu Gly Ile Glu Asn Leu Val Arg Glu Ser Ile Arg
660 665 670

Gln Thr Ala Met Gly Thr Tyr Thr Ala Leu Ser Ser Arg His Lys Thr
675 680 685

Gln Ile Leu Gln Leu Ile Glu Gln Ala Leu Lys Gln Ser Ala Lys Leu
690 695 700

Phe Ile Val Thr Ser Val Asp Thr Arg Arg Phe Leu Arg Lys Ile Thr
705 710 715 720

Glu Ala Thr Leu Phe Asp Val Pro Ile Leu Ser Trp Gln Glu Leu Gly
725 730 735

Glu Glu Ser Leu Ile Gln Val Val Glu Ser Ile Asp Leu Ser Glu Glu
740 745 750 



Glu Leu Ala Asp Asn Glu Glu 
755 



(2) INFORMATION FOR SEQ ID NO: 90: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Ile Asp Ala Thr Ser Glu Ala Glu Ile Ser Ala Pro Arg Trp Leu Leu
1 5 10 15

Ser Met Gly Pro Asn Ser Gly Cys Gln Arg Asn Val Val Lys Cys Val
20 25 30

Val Ala Trp Gly Ile Tyr Gly Arg Val Val Leu Tyr Lys Ala Trp Arg
35 40 45



Arg Thr Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Ser Arg Gly Asp 
1 

(2) INFORMATION FOR SEQ ID NO: 92: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 257 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Trp Gln Gln Ser Phe Ala Ile Ser Phe Tyr Glu Tyr Asn Arg Ala Ser
1 5 10 15

Leu Arg Ala Ala Ser Asp Gly Leu Lys Arg Arg His Gln Val Pro Val
20 25 30

Gly Glu Ala Leu Leu Gly Arg Val Ile Asp Gly Phe Gly Arg Pro Leu
35 40 45

Asp Gly Arg Glu Leu Pro Asp Val Cys Trp Lys Asp Tyr Asp Ala Met
50 55 60

Pro Pro Pro Ala Met Val Arg Gln Pro Ile Thr Gln Pro Leu Met Thr
65 70 75 80

Gly Ile Arg Ala Ile Asp Ser Val Ala Thr Cys Gly Glu Gly Gln Arg
85 90 95

Val Gly Ile Phe Ser Ala Pro Gly Val Gly Lys Ser Thr Leu Leu Ala
100 105 110

Met Leu Cys Asn Ala Pro Asp Ala Asp Ser Asn Val Leu Val Leu Ile
115 120 125

Gly Glu Arg Gly Arg Glu Val Arg Glu Phe Ile Asp Phe Thr Leu Ser
130 135 140

Glu Glu Thr Arg Lys Arg Cys Val Ile Val Val Ala Thr Ser Asp Arg
145 150 155 160

Pro Ala Leu Glu Arg Val Arg Ala Leu Phe Val Ala Thr Thr Ile Ala
165 170 175

Glu Phe Phe Arg Asp Asn Gly Lys Arg Val Val Leu Leu Ala Asp Ser
180 185 190

Leu Thr Arg Tyr Ala Arg Ala Ala Arg Lys Ser Leu Trp Arg Arg Arg
195 200 205

Asp Arg Gly Phe Trp Arg Ile Ser Pro Gly Val Phe Ser Ala Leu Pro
210 215 220

Arg Leu Leu Glu Arg Thr Gly Met Gly Glu Lys Gly Ser Ile Thr Ala
225 230 235 240 

Phe Tyr Thr Val Leu Val Glu Gly Asp Asp Met Asn Glu Ala Val Gly 
245 250 255 

Gly 

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: 

Ser Pro Phe Thr Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 94: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Trp Thr Tyr Cys Thr Ile Pro Thr Ala Cys Arg Glu Gly Ala Leu Ser
1 5 10 15

Cys His 

(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Arg Val Gly Asn Ala Gln Pro Arg Phe Ser Ser Arg Tyr Gln Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Ala Ser Ser Thr Gly Gly Asp Ile Ala Thr Val Pro Gly Ala Leu Pro
1 5 10 15

Gly Gly 

(2) INFORMATION FOR SEQ ID NO: 97: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Thr Val Asn Thr His Trp Gly Ile Pro Ala Arg Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 98: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Tyr Leu Ser Gly Tyr Leu His Ile Phe Ala Thr Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 99: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Ser Met Arg Thr Arg Ala Thr Tyr Arg Lys Ile Thr Pro Asn Thr His
1 5 10 15

Arg Val Ile Met Glu Thr Leu Leu Glu Ile Ile Ala Arg Leu Lys Ser
20 25 30

Asn Tyr Ala Ala Ser Leu Pro Tyr Leu Ile Ser Ser Asn Arg Arg Leu
35 40 45 

Leu Arg Asn Ser Arg Phe Ala Arg Arg Ala Leu 
50 55 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Gln Cys Leu Pro Asp
1 5 

(2) INFORMATION FOR SEQ ID NO: 101: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Trp Ala Gly Lys Val Arg Tyr Leu Val Ile Tyr Cys Trp Ile Arg Asn
1 5 10 15

Asn Lys Trp Pro Gly Tyr Ser Leu Arg Arg Arg Ala Phe 
20 25 

(2) INFORMATION FOR SEQ ID NO: 102: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Arg Asn Gly Lys Gln Leu Glu Asn Gln Tyr Gln Gln Leu Val Ser Arg
1 5 10 15

Arg Ser Glu Leu Gln Lys Asn Phe Asn Ala Leu Met Lys Lys Lys Glu
20 25 30

Lys Ile Thr Met Val Leu Ser Asp Ala Tyr Tyr Gln Ser
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

Gly Lys Ser Trp Val Ala Met Pro Val Leu Ser Gly 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Arg Gly Gly Gly Gly Thr Tyr Gly Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 105: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

Thr Thr His Ala Pro Gly Ile Thr His Trp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Ser Ser Cys Ser Ile Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Glu Arg Gly Phe His Ala Thr Leu Ser Cys 
1 5 10

(2) INFORMATION FOR SEQ ID NO:108: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Trp Arg Leu Ser 
1 

(2) INFORMATION FOR SEQ ID NO: 109: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
Arg Cys Arg Val 



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Ile Arg Gly Ala Asn Pro Val Lys Asn Gln Cys Pro Ser Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Asn Leu Pro Phe Asp Glu Ser Ala Lys Ala Val Ala Gly Val Ser Val 
1 5 10 15

Ala Ala Tyr Gly Val Tyr Asn Phe Pro Gly Asp Ile Leu Cys
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Arg Met Lys Ser Val Arg Gly Trp Arg Tyr Phe Gln Arg Lys Ala Leu
1 5 10 15

Pro Leu Val Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 113: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Val Cys Asn Asn Ile Gln Tyr Ser Lys Gly His Tyr Leu Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114: 

Ile Ile Ile Met Ser Trp Val Gly Cys Gly Leu Gln Asn Asn Ala Gly
1 5 10 15

Ser Ala Gly Val Lys Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Leu Ala Pro Leu Ile Asp Arg Leu Ser Ile Leu Asn Cys Tyr Met Glu
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 116: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Leu Asn Gly Gly Trp Arg Arg Tyr Cys Lys Pro Val Met Gln Pro Ser
1 5 10 15

Val Arg Thr Ser Arg Gln His Pro Ala Val Ile Tyr His Ile Ser
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 117: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Arg Cys Ile Leu Asn Gly Gln Leu Lys Ser Met Ser Ser Ile Ala Leu
1 5 10 15

Phe Leu His Gly Gln Arg Val Phe Cys Ala Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Ser Glu Ser Phe Leu Leu Ser Asp Asn Arg Phe Ile Leu Pro Leu Leu
1 5 10 15

Trp 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

Ser Leu Tyr Ile Gln Ala Gly Ala Ser Leu His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 120: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Ser Asn Leu Ser Leu Ser Lys Ser Ala Trp Ala Phe Gly Phe Ile Ala
1 5 10 15

Ser Ala Thr Ser Asp Ser Val Phe Leu Leu Phe Asn Tyr Leu Gly Glu
20 25 30

Ser Thr Gln Gly Cys Cys
35 

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Gln Arg Ile Thr Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 122: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Asn Leu Thr Asn 
1 

(2) INFORMATION FOR SEQ ID NO: 123: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Ser Arg Ile Ser Lys Arg Tyr Leu Arg Gln Gly Ala Gln Cys Gln Arg
1 5 10 15

Val Thr Glu Arg Leu Gln Ser Asn Leu Ser Arg Tyr His Asn Arg Cys
20 25 30 

Ser Leu Arg Ser Asp Val Arg Val Trp Lys Leu Asp Asn Tyr Asp Asn 
35 40 45 

Leu Lys Arg Gly Thr Phe Cys Leu 
50 55 

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

Val Asp Val Leu Arg Gln Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Met Thr Val Leu Leu Gly Lys Val Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Leu Pro Val Ala Met Asn Leu Trp Cys Val Leu His Val Gly Ile Phe
1 5 10 15

Ala Lys Ile Gln Arg Lys Pro Asp Lys Lys Asn Asn Met Arg Thr Ile
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 127: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 225 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Arg Ser Arg Ser Cys His Glu Arg Tyr Ser Met Ser Leu Pro Asp Ser
1 5 10 15

Pro Leu Gln Leu Ile Gly Ile Leu Phe Leu Leu Ser Ile Leu Pro Leu
20 25 30

Ile Ile Val Met Gly Thr Ser Phe Leu Lys Leu Ala Val Val Phe Ser
35 40 45

Ile Leu Arg Asn Ala Leu Gly Ile Gln Gln Val Pro Pro Asn Ile Ala
50 55 60

Leu Tyr Gly Leu Ala Leu Val Leu Ser Leu Phe Ile Met Gly Pro Thr
65 70 75 80

Leu Leu Ala Val Lys Glu Arg Trp His Pro Val Gln Val Ala Gly Ala
85 90 95

Pro Phe Trp Thr Ser Glu Trp Asp Ser Lys Ala Leu Ala Pro Tyr Arg
100 105 110

Gln Phe Leu Gln Lys Asn Ser Glu Glu Lys Glu Ala Asn Tyr Phe Arg
115 120 125

Asn Leu Ile Lys Arg Thr Trp Pro Glu Asp Ile Lys Arg Lys Ile Lys
130 135 140

Pro Asp Ser Leu Leu Ile Leu Ile Pro Ala Phe Thr Val Ser Gln Leu
145 150 155 160

Thr Gln Ala Phe Arg Ile Gly Leu Leu Ile Tyr Leu Pro Phe Leu Ala
165 170 175

Ile Asp Leu Leu Ile Ser Asn Ile Leu Leu Ala Met Gly Met Met Met
180 185 190

Val Ser Pro Met Thr Ile Ser Leu Pro Phe Lys Leu Leu Ile Phe Leu
195 200 205

Leu Ala Gly Gly Trp Asp Leu Thr Leu Ala Gln Leu Val Gln Ser Phe
210 215 220 

Ser 
225 

(2) INFORMATION FOR SEQ ID NO: 128: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Met Ile Leu Asn
1

(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Arg Asn Phe Tyr Gly Ser Ser Phe Leu Arg Leu Cys Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Cys Trp Trp His Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Leu Val Ser Ser 
1 

(2) INFORMATION FOR SEQ ID NO: 132: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Ala Leu Phe Arg Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Leu Lys Tyr Arg Thr Lys Arg Tyr Ser Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Leu Asn Tyr Trp Gln Leu Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Trp Ser Ala Thr His Gly Leu Ala Val Ser Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 136: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Ile Ile Pro Gly Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 137: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Cys Tyr Glu Leu Glu Ser Met Val Glu Trp His Asn Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO:138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Met Ser Gly Leu Leu His Trp Leu Trp Leu Leu Phe Asp His 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Ala Phe Leu Tyr Tyr Phe Pro Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 140: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Lys Val Ala Val 
1 

(2) INFORMATION FOR SEQ ID NO: 141: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

Gly Pro His Phe Tyr Val Met Ala Cys Leu Cys His Leu Pro Phe Arg 
1 5 10 15

Tyr Tyr Gln Ser Phe Thr Ser Arg Arg Leu
20 25 

(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

Cys Ile Leu Val Lys Ile Thr Val Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143: 

Ser Leu Glu Arg 
1 

(2) INFORMATION FOR SEQ ID NO: 144: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Leu Leu Val Phe Gln Leu Gly Phe Val Arg Arg Phe Pro Phe Gly Pro
1 5 10 15

Leu Ile Trp Arg Gly Phe Cys Leu Ile Leu Tyr Val Ala Arg Gln Trp
20 25 30

Val Arg Tyr Ser Ile Leu Gln
35 

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

Lys Leu Lys Pro His Phe Leu Ala Cys Phe Ser Ala Ser Ser Cys Val
1 5 10 15

Leu Phe Ser Leu 
20 

(2) INFORMATION FOR SEQ ID NO:146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Ala Ala Ala Trp Ser Leu Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 147: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147: 

Thr Phe Cys Met Ser His Ile Asn Ile Tyr His Gln Gly Val Leu Tyr
1 5 10 15

Tyr Leu Thr Ser Asn Phe 
20 

(2) INFORMATION FOR SEQ ID NO:148: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148: 

Asn Ile Ser Arg Gln Ser Gly Glu Arg Phe Ile Asn Tyr Val Ser Ala
1 5 10 15

Ser Leu Phe Leu Pro 
20 

(2) INFORMATION FOR SEQ ID NO: 149: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

Tyr Val Trp Tyr
1



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Pro Ile Trp Leu
1 

(2) INFORMATION FOR SEQ ID NO: 151: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Ile Gly Arg His Asn Asn
1 5 

(2) INFORMATION FOR SEQ ID NO: 152: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:152: 

Met Cys Phe Ser Ser Gln Cys Arg Ser Lys Val Tyr Trp Phe Tyr
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153: 

Xaa Pro Asp Leu Ile Pro Leu Cys Ser Ser Ser Leu Phe Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 154: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 225 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Ile Leu Tyr Leu Ser Lys Arg Leu Val Ser Ile Cys Met Ser Glu Lys
1 5 10 15

Thr Glu Gln Pro Thr Glu Lys Lys Leu Arg Asp Gly Arg Lys Glu Gly
20 25 30

Gln Val Val Lys Ser Ile Glu Ile Thr Ser Leu Phe Gln Leu Ile Ala
35 40 45

Leu Tyr Leu Tyr Phe His Phe Phe Thr Glu Lys Met Ile Leu Ile Leu
50 55 60

Ile Glu Ser Ile Thr Phe Thr Leu Gln Leu Val Asn Lys Pro Phe Ser
65 70 75 80

Tyr Ala Leu Thr Gln Leu Ser His Ala Leu Ile Glu Ser Leu Thr Ser
85 90 95

Ala Leu Leu Phe Leu Gly Ala Gly Val Ile Val Ala Thr Val Gly Ser
100 105 110

Val Phe Leu Gln Val Gly Val Val Ile Ala Ser Lys Ala Ile Gly Phe
115 120 125

Lys Ser Glu His Ile Asn Pro Val Ser Asn Phe Lys Gln Ile Phe Ser
130 135 140

Leu His Ser Val Val Glu Leu Cys Lys Ser Ser Leu Lys Val Ile Met
145 150 155 160

Leu Ser Leu Ile Phe Ala Phe Phe Phe Tyr Tyr Tyr Ala Ser Thr Phe
165 170 175

Arg Ala Leu Pro Tyr Cys Gly Leu Ala Cys Gly Val Leu Val Val Ser
180 185 190

Ser Leu Ile Lys Trp Leu Trp Val Gly Val Met Val Phe Tyr Ile Val
195 200 205

Val Gly Ile Leu Asp Tyr Ser Phe Gln Tyr Tyr Lys Ile Arg Lys Ala
210 215 220

Ile
225



(2) INFORMATION FOR SEQ ID NO: 155: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 



Val Lys Met Thr 
1 



(2) INFORMATION FOR SEQ ID NO: 156: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Asn Arg Ser Ile Lys Ile Trp Arg Ala Thr Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO:157: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

Arg Arg Gly Val Gly Asn Ala Glu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

Asn Thr Lys Trp Glu Phe Ser Ser Ile Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

Thr Ile Cys Cys Gly Ser Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 160:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

Ser Asn Ala Tyr Cys Gly Leu Ser Trp Leu Ser Ser His Arg Tyr Ala 
1 5 10 15

Asn Thr Thr Arg Pro Gly Lys Arg Gln
20 25 

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Thr Gln Leu His Pro Arg Cys
1 5 

(2) INFORMATION FOR SEQ ID NO: 162: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Ala Gly Pro Leu Ile Ile Phe
1 5 

(2) INFORMATION FOR SEQ ID NO: 163: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

Ser Gly Thr Arg Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 164: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Thr Arg Cys Ser Leu Val Thr Tyr Gly Asp Glu Asp Arg Leu Cys Ala 
1 5 10 15

Phe Tyr Arg Asn Thr Ile Asn Ala Phe Gly Met Leu Leu Gln Ala Thr
20 25 30

Ala Lys Val Lys Arg Val Ile Ala Tyr Arg Ala Val Leu Asp Asp Lys
35 40 45

Gly Glu Arg Leu Lys Ile Ile Ala Phe Ser Leu Ala Gln Ala Pro Asp
50 55 60 

Ser Val Leu 
65 

(2) INFORMATION FOR SEQ ID NO: 165: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

Trp Ile Gly Ala Ser Glu Trp Thr Arg Thr Thr Arg Pro Pro Pro Cys
1 5 10 15

Gln Gly Gly Ala Leu Thr Asn
20 

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

Ala Met Asn Gly Asn Val Val Gly Asp Asn Gly Asp Glu Tyr 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 167:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

Arg His Asn Arg Asn Glu Ala Arg Gly Lys Ser Gln Phe Ser Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 168: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

Asn His Leu Ile Ala Val Glu Ile Cys Asn Met Ser Arg Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

Pro Pro Cys Asp Gly Tyr Arg Arg Ile Ile Gly Ala Arg Cys Lys Met
1 5 10 15

Met Ala Asp Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Arg Ile Arg Ser Ile Ile Asn Thr Ala Ala Glu Val Arg Pro Met Met
1 5 10 15

Asn Pro Ile Gln Lys Pro Ala Gly Pro Ile Arg Ser Thr Thr Lys Ser
20 25 30

Val Asn Ala Arg Ile
35 

(2) INFORMATION FOR SEQ ID NO: 171: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 



Pro Leu Gly Lys Pro Asn Thr Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Ala Val Lys Val Ile Lys Lys Met Glu Arg Val Ser Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 173: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:173: 

Pro Arg Arg Ile Pro Leu Pro Ile Thr Cys Ile Glu Ser Glu Ile Trp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 174: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

Thr Ala Ala Ser Ser Ile Asn Cys Gly Lys Arg His Asp Leu Arg Val
1 5 10 15

Val Ile Val Glu Gln Ser Asn Met Leu Thr Gln Ser Asn Gly Lys Asn
20 25 30 

Ser Gly Asn His Ser His Thr Asn Ala Asp Ala 
35 40 

(2) INFORMATION FOR SEQ ID NO: 175: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

Thr Gly Thr Arg Cys Val Cys Ala Ser Ser Val Glu Pro Trp Pro Arg 
1 5 10 15

Pro Ile Thr His Ser Asn Arg Tyr Arg Arg Ser Gln Arg His Arg Gln
20 25 30

Tyr Glu His Gln Arg Ala Lys Val Lys Arg Asn Leu Met Thr Gly Asp
35 40 45

Ile His Asn Thr
50 

(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

Trp Arg Asn Gln Gln Arg Asn Asp Arg Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO:177: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

Arg His Phe Lys Glu Gln Pro Ala Gln Ser Ala Thr Pro Val Glu Ser
1 5 10 15

Gly Ala Ser 

(2) INFORMATION FOR SEQ ID NO: 178: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Arg Arg Tyr Arg Val Cys Gln Ser Leu Phe His Tyr Glu Tyr His Ala
1 5 10 15

Leu Asn Ala Arg Val 
20 

(2) INFORMATION FOR SEQ ID NO: 179: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

Cys Lys Lys Ala Trp Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

Thr Ser Pro Asn Arg Pro Pro Gln Ser Gln Arg Arg Ser Arg Tyr Arg
1 5 10 15

Arg Val Pro Ala Tyr Gln Asn Gly His Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 181: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

Lys Tyr Ser Ser Pro Glu Tyr Ser Pro Ala Gly Pro Lys Ile Pro Ser
1 5 10 15

Pro Tyr Pro Val Trp Phe Trp Pro Asp Leu Arg Thr Gly Phe Ala Leu
20 25 30

Pro Glu Arg Lys Gly Ile Leu Arg Pro Thr Ala Ala Arg Glu Asp Asn
35 40 45

Pro Arg Leu Tyr Arg Pro Ala Pro Asp Gln Tyr Tyr Ala
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Cys Ile Arg His Ser Thr Gly Arg Ser Ser Pro Ala Arg Arg Gln Ser
1 5 10 15

Pro Pro Ala Arg Thr Leu Val Glu Pro Asp Ala Arg Tyr Ala His Asp 
20 25 30 

Gly Arg Ser His 
35 

(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Val Ala Gln Ser Gln Ala Ser Arg Pro Ala Val Ser Arg Asp Gln Thr
1 5 10 15

Lys Trp Arg Glu Ala Asp Arg Gly Ala Asp Ser Asp Gly Ser His Val
20 25 30

Arg Ser Ala Ile Ala Ser Arg His Asp Gly Ile Asp Glu Ser Ile Ala
35 40 45

Val Tyr Thr Thr Cys Ala Arg Ile Thr Gly Ile
50 55 

(2) INFORMATION FOR SEQ ID NO: 184: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Thr Leu Ile Thr Asp Ala Leu His Trp Tyr Thr Ser Ala Arg Ile His
1 5 10 15

Leu Leu Phe Cys Cys Tyr Met Lys Asp 
20 25 

(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

Lys Ala Ala Glu Val Ala Ala Lys Arg Asn Ser Arg Gly Asn Phe Ser 
1 5 10 15

Leu Leu 

(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Arg Gly Ile Thr Ile Ser Pro Val Lys Lys Gln Leu Leu Thr Ala His
1 5 10 15

Cys Trp Gln Ala Val Phe Pro Pro Ala Ile Val Leu Asn Ser Ser Ala
20 25 30

Phe Ile Tyr Phe Arg Ser
35 

(2) INFORMATION FOR SEQ ID NO: 187: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Arg Tyr Val Tyr Gly Asp Arg Thr Gly Tyr Arg Glu Thr Gly Ile Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Cys Arg Thr Glu Pro Gly Ala Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Thr Ile Ala Glu Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 190: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

Ser Ile His Gln
1 

(2) INFORMATION FOR SEQ ID NO: 191: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

Arg Gln Cys Phe Ile Thr Ala Gly Tyr Val Asp Gln Thr Asn Gly Leu
1 5 10 15

Tyr Ala Val Asn Gly Arg Arg Arg Leu Ser Cys Gln Asn Ile Thr Ala
20 25 30

Gln His Ala Lys Arg Leu Ile Ser Trp Ile Thr Trp His Glu Gly Ser
35 40 45

Ser Tyr Ser Ile Ser Tyr Cys Pro Tyr Val Leu Ser Tyr Gly Met
50 55 60 



(2) INFORMATION FOR SEQ ID NO: 192: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Cys Gly Ser Leu Ser Leu Ile Ala Arg Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 193: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

Ser Glu Ser Asn Ala Gly Ile Thr Tyr Ala Ala Ser Tyr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 194: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

Cys Glu Lys Lys Gln Glu Glu Asp Gly Val Thr Leu Arg Val Glu Gln
1 5 10 15



Ser Ala Val Tyr 
20 



(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

Gly Tyr Leu Asp Leu Thr Val Ile Arg Ile Gly Gln Phe Thr Thr Ala
1 5 10 15

Asp Lys Met Phe Pro Ala Asn Gln Leu Val Val Ser Pro Gln Glu Glu
20 25 30

Gln Ala Glu Asp
35 

(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:196: 

Phe Phe Lys Arg Thr Lys Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Arg Asn Ala Glu Ser Asp Gly Gly Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 198: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Leu Met Ala Lys Val Thr Ile Ala Leu Pro Thr Tyr Asp Glu Gly Ser
1 5 10 15

Asn Ala Ser Pro Ser Ser Val Ala Val Phe Ile Lys Tyr Ser Pro Gln
20 25 30

Val Asn Met Glu Ala Phe Arg Val Lys Ile Lys Asp Leu Ile Glu Met
35 40 45

Ser Ile Pro Gly Leu Gln Tyr Ser Lys Ile Ser Ile Leu Met Gln Pro
50 55 60

Ala Glu Phe Arg Met Val Ala Asp Val Pro Ala Arg Gln Thr Phe Trp
65 70 75 80

Ile Met Asp Val Ile Asn Ala Asn Lys Gly Lys Val Val Lys Trp Leu
85 90 95

Met Lys Tyr Pro Tyr Pro Leu Met Leu Ser Leu Thr Gly Leu Leu Leu
100 105 110

Gly Val Gly Ile Leu Ile Gly Tyr Phe Cys Leu Arg Arg Arg Phe
115 120 125



(2) INFORMATION FOR SEQ ID NO: 199:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

Ala Asp Leu Ile Pro Arg Cys Cys Asn Phe Ile Val Ile Ser Gly Asn
1 5 10 15

Leu Leu Val Thr Leu Tyr Arg Asn Gly Trp Ile Ser Trp Ala Phe Ile
20 25 30 

Phe Lys Leu Leu Ala Leu Trp Arg Ser Ala Arg Val Gly Ser Ser Ser 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 200:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:200: 



Gln Ser Val Lys

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

Thr Lys Arg Lys Leu Cys Tyr Ser Ser Leu Phe Lys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 

Gln Ala Lys Thr Ala Gly Ser Ser Cys Ala Ala Tyr Ile Gly Ile Cys
1 5 10 15

Leu Trp Ala Gly Ile Ile Gln Thr Gln Val
20 25 

(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203: 

Leu Phe Tyr Ala Thr Arg Val Ser Ala Ile Ala Ser Thr Val Val
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 204:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

Asp Leu Ala Ala Ile Trp Leu Val Gly Ala Lys Arg Trp Gln Ile Thr
1 5 10 15

Ser Ser Ala Ser Asp Ala Thr Asn Cys Ile Ala Asp Arg Tyr Arg His
20 25 30 

Ser 

(2) INFORMATION FOR SEQ ID NO: 205: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 
Ser Gly Ser Ala 



(2) INFORMATION FOR SEQ ID NO:206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:206: 

Arg Cys Gly Phe Tyr Met Arg Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 207:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 

Tyr Tyr Tyr Pro Leu Arg Ser Val Tyr Phe Gly Arg Arg Leu Leu Leu 
1 5 10 15

Pro Arg Leu Ser Ser Trp Ser Ile Cys Tyr Glu Phe Tyr Phe Thr Ser
20 25 30

Ser Asp Gly Asn
35

(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 

Ala Thr Arg Ser Lys Tyr Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO:209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209: 

Val Thr Val Asp Asn Ile Thr Ile Asn Phe Ile Cys Ala Arg Ala Thr
1 5 10 15

Ser 

(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 

Glu Ser Phe Thr Cys Tyr Cys Glu Leu Arg Leu Pro 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 211:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 



Lys Asn His Pro Arg Arg Leu Ser Leu Ser Ala 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 212:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 

Ala Ala Thr Arg Thr Ser Val Leu Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO:213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213: 

Lys Tyr Ala Gly Lys Asn Gly Ser Gly Met Ala Gly Thr Ala Cys Lys 
1 5 10 15

Thr Phe Thr Arg Arg 
20 

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214: 

Lys Ser Ile Ser Phe Ile Gly Arg Ser Arg Ser Ala Ser Tyr
1 5 10

(2) INFORMATION FOR SEQ ID NO:215: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

Tyr Arg Thr Gly Ser Val Gly Leu Val Arg Pro Thr Val Gly Arg Gln
1 5 10 15



Cys Tyr Val Pro Ser Ser Gly Thr Pro Gly His Gly Tyr Gly Gly Arg 
20 25 30 

Gly Ser Ala Leu Phe Ala Tyr Ser Ser 
35 40 

(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

Lys Arg Gly Ile Asp Ala Arg Asn Phe Trp Gln Ala Val Tyr Val Asp
1 5 10 15

Tyr Arg Ala Trp Phe Leu Ser Arg Ser Gly 
20 25 

(2) INFORMATION FOR SEQ ID NO: 217:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

Thr Phe Leu Asn Thr Ile Cys Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 218:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 

Ile Phe Thr Phe Ser Ser Phe Gln Arg Val Thr Glu Met Val Thr
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 219: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

Ile Leu Lys Leu Met Arg
1 5 

(2) INFORMATION FOR SEQ ID NO:220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Arg Pro Leu Gln His Leu Pro Leu Ala Arg Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

Arg Asn Lys Arg Leu Phe Arg Leu Gln Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 222: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Leu Ser Arg Ser Tyr Arg Lys Arg Arg Gly Gln Arg Ser Met Arg Arg
1 5 10 15

Trp Lys Lys 

(2) INFORMATION FOR SEQ ID NO: 223: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223: 

Val Val Asn Cys Ala Lys Ile Ile Asn Ser Leu Met Leu Arg Asn Trp
1 5 10 15

Ser Ala Asp Ser Arg Leu Cys Cys Val 
20 25 

(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Asn Lys Tyr Arg Arg Ile Met Gly Gln Arg Cys Val Arg Leu Pro Lys
1 5 10 15

Arg Ile Val Ile Leu Ile Tyr Arg Met Arg Ile Lys Leu Ser Leu Leu
20 25 30

Gln Trp Arg Leu Leu Pro Ala Gly Cys Gln Lys Arg Lys Asn Ala Ile
35 40 45

Cys Asn Arg Asn Trp Ile Arg Tyr Ser Gly Gly Gly Met Gly Thr Cys
50 55 60 

Arg Phe 
65 

(2) INFORMATION FOR SEQ ID NO: 225: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 178 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

Phe Thr Gly Thr Trp Arg Ser Gly Tyr Arg Thr Leu Ser Ser Leu Lys 
1 5 10 15

Arg Phe Met Gln Gln Ala Ile Asp Asn Asp Glu Met Pro Leu Ser Gln
20 25 30

Trp Phe Arg Arg Val Ala Asp Trp Pro Asp Arg Cys Glu Arg Val Arg
35 40 45

Ile Leu Leu Arg Ala Val Ala Phe Glu Leu Ser Ile Cys Ile Glu Pro
50 55 60

Ser Glu Gln Ser Arg Leu Ala Ala Ala Leu Val Arg Leu Arg Arg Leu
65 70 75 80

Leu Leu Phe Leu Gly Leu Glu Lys Glu Cys Gln Arg Glu Glu Trp Ile
85 90 95

Cys Gln Leu Pro Pro Asn Thr Leu Leu Pro Leu Leu Leu Asp Ile Ile
100 105 110

Cys Glu Arg Trp Leu Phe Ser Asp Trp Leu Leu Asp Arg Leu Thr Ala
115 120 125

Ile Val Ser Ser Ser Lys Met Phe Asn Arg Leu Leu Gln Gln Leu Asp
130 135 140

Ala Gln Phe Met Leu Ile Pro Asp Asn Cys Phe Asn Asp Glu Asp Gln
145 150 155 160

Arg Glu Gln Ile Leu Glu Thr Leu Arg Glu Val Lys Ile Asn Gln Val
165 170 175

Leu Phe 

(2) INFORMATION FOR SEQ ID NO: 226: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226: 

Tyr Leu Ala Phe Asn Ile
1 5 

(2) INFORMATION FOR SEQ ID NO: 227:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:227: 



Val Asn Trp Leu Ser Gly Ser Ser 
1 5

(2) INFORMATION FOR SEQ ID NO: 228:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 



Gly Val Arg Met Asp Trp Asp Leu Ile Thr Glu Arg Asn Ile Gln Leu
1 5 10 15

Phe Ile Gln Leu Ala Gly Leu Ala Glu Arg Pro Leu Ala Thr Asn Met
20 25 30

Phe Trp Arg Gln Gly Gln Tyr Glu Thr Ile Ile Thr Val Val Phe Ser
35 40 45

Tyr Val Arg Tyr Ser Ser Lys Pro Ser
50 55



(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:

Thr Lys Asn Cys Phe Leu Lys Arg Trp Leu Thr Gly Asn Pro Gln Arg
1 5 10 15

Ser Arg Val Phe Leu Asn Asp Tyr Phe Cys Cys Ala Met Gly Leu Gln
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 230: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

Val Val Leu His Leu Phe Pro Ala Pro Pro Ser Ser Gly Tyr Asp Tyr
1 5 10 15

Ile Ile Asp Lys
20

(2) INFORMATION FOR SEQ ID NO: 231:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

Asn Phe Xaa Glu Ser Gln Cys Val His Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232: 

Val Arg Glu Ser Gly Arg Asn Ser Gly Ser Val Tyr Ala Arg Val Gly 
1 5 10 15

Arg Ile Trp Phe Trp Arg Arg Cys Tyr
20 25 

(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

Cys Cys Tyr Pro Cys Arg Pro Gly Trp Leu Ile Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO:234: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

Leu Leu Ser Thr Leu Cys Phe Gln
1 5 



(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

Leu Leu Phe Ile Leu Val Thr Leu Ser Ile Tyr Arg Tyr Phe Arg Leu
1 5 10 15

Tyr Tyr Leu Leu Leu His Tyr Ile Val Cys His Ser Gln Ser Ala His
20 25 30

His Gly Trp Tyr Cys Tyr Asn Ile Met Pro Val Ile Leu Trp Met Leu
35 40 45 

Ser Val Ser Leu Ser 
50 

(2) INFORMATION FOR SEQ ID NO:236: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

Glu Glu Ile Ser Pro Leu Gly Trp Ser Tyr Leu Pro Ser Leu Leu Ser
1 5 10 15

Cys Asn Leu Leu Ser Leu Gln Lys Val Ser Arg Gly Trp Arg Lys Leu
20 25 30

Ala His Val Ser Arg Leu Met Gly Cys Gln Ala Asn Lys
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 237: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 






Val Ser Met Ala Ile Cys Val Pro Glu Leu Ser Met Gln Thr Met Pro
1 5 10 15



Val His 

(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

Asp Ser Met Ser Ser Arg Lys Ala Ala Phe Ser Val Arg Trp Thr Val 
1 5 10 15

Arg 

(2) INFORMATION FOR SEQ ID NO: 239: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Asn Leu Leu Lys Ala Ile Arg Leu Pro Val Leu Leu Leu Phe Trp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 240: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Thr Leu Ser Ala Val Ser Leu Ser Leu Ser Tyr Asn Met Ile Cys Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 241: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Val Arg Leu Phe Thr Leu Ile Ala Tyr Cys Gln Ser Glu Met Val Tyr
1 5 10 15

Val Gly Lys Phe His Arg Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 

Phe Pro Leu Ala Arg Glu Leu Leu Ser Pro Val Ser Arg Val Arg Asn 
1 5 10 15

Ala Arg Thr Trp Arg Gln Ser
20 

(2) INFORMATION FOR SEQ ID NO:243: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243: 

Val Leu Lys Leu Pro Asp Asn Leu Ser Arg Ser Tyr 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 244: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 

Pro Leu Trp Phe 
1 

(2) INFORMATION FOR SEQ ID NO: 245: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:245: 
Cys Ser Ser Leu 



(2) INFORMATION FOR SEQ ID NO: 246: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 

Phe Leu Ala Phe Leu Leu Ser Leu Ser Leu Ser Phe Gln Arg Cys
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 

His Cys Gln Leu Ser Ser Phe Ala Ala Lys Ser Leu Trp Phe Pro Gln
1 5 10 15

Met Ala Ser Lys His Arg Lys Lys Ile Val Trp Phe Pro Ala His Val
20 25 30 

Leu 

(2) INFORMATION FOR SEQ ID NO: 248: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:248: 

Ser Tyr Val Leu Ala Arg Arg Tyr Ile Leu Pro Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:249: 

Phe Val Ile Leu Thr Pro
1 5 

(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 

Asp Gly Phe Tyr Leu Arg Ile Pro Ala Ser Leu Ser Leu Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251: 

Ile Leu Arg Phe Cys Leu Asn Pro Pro Lys Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO:252: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:252: 

Arg Tyr Cys Tyr Ile Arg Asn Pro Tyr Leu Val Tyr Leu Phe Pro Leu
1 5 10 15

Arg Arg Ile Ile Tyr
20 



(2) INFORMATION FOR SEQ ID NO: 253: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253: 

Ala Arg Thr Leu Val Trp Trp Val Thr Ala Arg Arg Tyr Arg Thr Gly 
1 5 10 15

Trp Gly Arg Ser Val Gly Leu Gln Lys Thr Trp Pro Ile Arg Arg Lys
20 25 30 

Val Leu Asp Trp Thr Phe Ser Arg Ala Ala Asn Val Ser Leu Pro Tyr 
35 40 45 

(2) INFORMATION FOR SEQ ID NO:254: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 

Asn Val Ser Cys Phe Gly Ile Trp Glu Ser Leu Leu Val Phe Arg Lys
1 5 10 15

Arg Val Ile

(2) INFORMATION FOR SEQ ID NO: 255: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

Met Arg Trp Lys Lys Thr Thr Leu Ser Trp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 256: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:256: 

Lys Ser Phe Ser Ala Ser Tyr Pro Leu Ile Lys Ser Leu Lys Leu Cys
1 5 10 15

Asn Gly Leu Tyr Gln Ser Gly Phe Leu Leu Glu Ile Tyr Val Leu Phe
20 25 30 

Ser Ala Pro 
35 

(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257: 

Leu Thr Gly Arg His Val Lys Lys Met Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 258: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258: 

Gln Asn Met Ser Val Ser Arg Phe Val Val Ile Phe Cys Val Val Leu
1 5 10 15

Ile Arg Lys Glu Asn Arg Cys Arg Phe Cys Gly Ser Ala Lys Val Leu
20 25 30

Lys Thr Ser Cys Val Asn Pro Phe Ala Arg Arg Gln Trp Gly Pro Ile
35 40 45

Leu Arg Cys Arg Leu Val Ile Arg Arg Arg Ser Cys Asn Leu Ser Ser
50 55 60 

Arg Arg 
65 

(2) INFORMATION FOR SEQ ID NO: 259:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 

Ser Ser Gln Pro Asn Tyr Ser Leu Ser Leu Leu Ser Thr Pro Asp Val
1 5 10 15

Ser Cys Glu Lys Leu Gln Lys Pro Pro Cys Ser Thr Tyr Arg Phe Cys
20 25 30 

His Gly Arg Asn 
35 

(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

Glu Arg Arg Ala Leu Tyr Lys Trp 
1 5 

(2) INFORMATION FOR SEQ ID NO:261: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 109 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:261: 

Lys Val Leu Thr Leu Ala Lys Arg Ser Trp Arg Thr Met Lys Asn Glu 
1 5 10 15

Leu Met Gln Arg Leu Arg Leu Lys Tyr Pro Pro Pro Asp Gly Tyr Cys
20 25 30

Arg Trp Gly Arg Ile Gln Asp Val Ser Ala Thr Leu Leu Asn Ala Trp
35 40 45

Leu Pro Gly Val Phe Met Gly Glu Leu Cys Cys Ile Lys Pro Gly Glu
50 55 60

Glu Leu Ala Glu Val Val Gly Ile Asn Gly Ser Lys Ala Leu Leu Ser
65 70 75 80

Pro Phe Thr Ser Thr Ile Gly Leu His Cys Gly Gln Gln Val Met Ala
85 90 95

Leu Ser Asp Ala Ile Arg Phe Pro Trp Ala Lys Arg Tyr
100 105 

(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:262: 

Gly Glu Leu Leu Met Ala Leu Val Val Pro Leu Met Ala Ala Asn Cys 
1 5 10 15

Pro Thr Ser Ala Gly Lys Thr Met Met Gln Cys Leu Leu Pro Gln Trp
20 25 30 

Phe Asp Ser Leu Ser Leu Asn His 
35 40 

(2) INFORMATION FOR SEQ ID NO: 263: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

Arg Gly Phe Ala Leu Leu Ile Ala Leu Arg Pro Val Ala Lys Gly Asn
1 5 10 15

Glu Trp Val Phe Phe Leu Leu Leu Ala Trp Gly Lys Ala Arg Phe Trp 
20 25 30 

Arg Cys Cys Val Met Arg Gln Thr Gln Thr Ala Met Phe Trp Cys
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 264: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:

Leu Val Asn Val Asp Glu Lys Ser Ala Asn Ser Ser Ile Leu His Cys
1 5 10 15

Leu Lys Arg Pro Glu Asn Val Val Ser Leu Leu Ser Gln Pro Leu Thr
20 25 30 

Asp Pro Pro 
35 

(2) INFORMATION FOR SEQ ID NO: 265: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

Gly Arg Cys Leu Trp Pro Pro Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO:266: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 

Gln Asn Phe Phe Ala Ile Met Glu Ser Glu Ser Ser Cys Leu Pro Thr
1 5 10 15

His 

(2) INFORMATION FOR SEQ ID NO: 267:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

Arg Val Met Pro Gly Pro His Gly Asn Arg Ser Gly Ala Gly Glu Thr
1 5 10 15

Ala Val Ser Gly Glu Tyr Arg Gln Ala Tyr Leu Val His Cys His Asp
20 25 30

Phe



(2) INFORMATION FOR SEQ ID NO:268: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:

Asn Val Arg Glu Trp Glu Lys Lys Ala Val Leu Pro His Phe Ile Arg
1 5 10 15

Tyr Trp Trp Lys Ala Met Ile
20 

(2) INFORMATION FOR SEQ ID NO: 269: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269: 

Met Lys Pro Leu Ala Asp Glu Val Arg Ser Leu Leu Asp Gly His Ile
1 5 10 15

Val Leu Ser Arg Arg Leu Ala Glu Arg Gly His Tyr Pro Ala Ile Asp
20 25 30

Val Leu Ala Thr Leu Ser Arg Val Phe Pro Val Val Thr Ser His Glu
35 40 45

His Arg Gln Leu Ala Ala Ile Leu Arg Arg Cys Leu Ala Leu Tyr Gln
50 55 60

Glu Val Glu Leu Leu Ile Arg Ile Gly Glu Tyr Gln Arg Gly Val Asp
65 70 75 80

Thr Asp Thr Asp Lys Ala Ile Asp Thr Tyr Pro Asp Ile Cys Thr Phe
85 90 95

Leu Arg Gln Ser Lys Asp Glu Val Cys Gly Pro Glu Leu Leu Ile Glu
100 105 110

Lys Leu His Gln Ile Leu Thr Glu
115 120 

(2) INFORMATION FOR SEQ ID NO: 270: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270: 

Ser Trp Lys Leu Cys Trp Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 271: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271: 

Lys Ala Ile Thr Arg Gln Ala Tyr Arg Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 272: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:272: 

Ser Ala Ala Thr Gly Asp Tyr Tyr Gly Thr Ala Asp Leu Pro Asp Ala 
1 5 10 15

Arg Phe Ser Ser Val Tyr Gln Thr Glu Arg Ile Asn Gly Leu Ala Arg
20 25 30

Tyr Val Ile Leu Ser Phe Ile Val Gly
35 40 

(2) INFORMATION FOR SEQ ID NO: 273: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:

Glu Thr Thr Asn Gly Arg Val Ile His Ser Gly Ala Glu Leu Phe Asp
1 5 10 15



Ala Thr Ala Ser Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 274: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:274: 

Arg Ile Ser Ile Ser Ser Leu Ser Pro Gly Glu Ala Asn Tyr Arg Arg
1 5 10 15

Ile Leu Met Arg Leu
20 

(2) INFORMATION FOR SEQ ID NO:275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275: 

Lys Arg Lys Lys Lys Leu Leu Trp Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO:276: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276: 

Ala Met Arg Ile Thr Lys Val Glu Gly Ser Leu Gly Leu Pro Cys Gln
1 5 10 15

Ser Tyr Gln Asp Asp Asn Glu Ala Glu Ala Glu Arg Met Asp Phe Glu
20 25 30 

Gln Leu Met His Gln Ala Leu Pro Ile Gly Glu Asn Asn Pro Pro Ala
35 40 45

Ala Leu Asn Lys Asn Val Val Phe Thr Gln Arg Tyr Arg Val Ser Gly
50 55 60 



Gly Tyr Leu Asp Gly Val Glu Cys Glu Val Cys Glu Ser Gly Gly Leu 
65 70 75 80 

Ile Gln Leu Arg Ile Asn Val Pro His His Glu Ile Tyr Arg Ser Met
85 90 95 

Lys Ala Leu Lys Gln Trp Leu Glu Ser Gln Leu Leu His Met Gly Tyr
100 105 110

Ile Ile Ser Leu Glu Ile Phe Tyr Val Lys Asn Ser Glu
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 277: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277: 

Arg Ala Ser Val Gly Gly Asp Thr Ser Asn Ala Arg Arg Tyr His Trp 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO:278: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:278: 

Ala Asp Ile Glu Tyr Ala Thr Ile Ser Ser Thr Ala Arg Asp Ile Ile
1 5 10 15

Tyr His Lys Leu Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 279: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:279:

Gly Val Asp Cys Arg Thr Met Leu Ala Ala Leu Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:280: 

Arg Ala Asn Trp His Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 281: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281: 

Ser Ile Gly Tyr Arg Ser
1 5 

(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:282: 

Ile Ala Ile Trp Asn Ser
1 5 

(2) INFORMATION FOR SEQ ID NO: 283:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:283: 

Met Gly Ala Gly Ala Val Ile Ala Ser Gln

(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284: 

Cys Asn Pro Leu Ser Glu Arg Ala Ala Asn Ile Leu Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 285:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285: 

Ser Thr Thr Ser Ala Ser Val Ala Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286: 

His Tyr Phe Tyr Met Ala Asn Gly Phe Phe Ala Gln Tyr Ser Arg Arg
1 5 10 15

Ala Phe Cys 

(2) INFORMATION FOR SEQ ID NO:287: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:

Ala Thr Thr Asp Leu Ser Cys Pro Ser Cys Gly Ser Pro Cys Ile Phe
1 5 10 15



Arg Leu Val Pro Ala Tyr Ile Asn Arg Thr
20 25 

(2) INFORMATION FOR SEQ ID NO: 288: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:288: 

Val Tyr Arg Asn Arg His Gly Arg Ser Asp Ser Leu Leu Arg Arg His 
1 5 10 15

Gln Thr Arg Phe Phe Cys Tyr Ser Thr Thr Trp Gly Asn Leu Arg Lys
20 25 30 

Gly Val Ala Asp Arg Gly 
35 

(2) INFORMATION FOR SEQ ID NO: 289:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:289: 

His Asp Glu Ile
1 

(2) INFORMATION FOR SEQ ID NO:290: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:290: 

Arg Ile Ser Pro Gly Tyr Arg Asn Ala Thr Cys Val Arg Glu Pro Asn
1 5 10 15

Val Lys Glu 

(2) INFORMATION FOR SEQ ID NO: 291:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291: 

Arg Asn Val Phe Ser Arg Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 292:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292: 

Ala Asp Thr Thr Thr Gly Ala Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO:293: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:293: 

Gly Arg Thr Cys Glu Ser Gly Asn Trp Thr Ile Thr Thr Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 294: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:294: 

Asn Gly Gly Arg Phe Ala Cys Arg Trp Met Phe Cys Ala Arg Gly Asp 
1 5 10 15



Asp Lys Ser Lys 
20

(2) INFORMATION FOR SEQ ID NO:295:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295: 

Pro Tyr Tyr Trp Ala Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO:296: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:296: 

Val Asp Cys Leu Trp Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 297:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297: 

Ile Tyr Gly Ala Tyr Tyr Thr Leu Val Ser Leu Gln Lys Tyr Ser Val
1 5 10 15

Asn Leu Ile Arg Lys Ile Ile Cys Glu Gln Tyr Asn Ser Val Pro Gly
20 25 30 

Arg Val Met Arg Asp Thr Val Cys Leu Tyr Pro Ile Arg Leu Cys Asn
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 298:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:



Leu Val Tyr Cys Phe Cys Phe Gln Tyr Cys Leu Ser Leu Ser Ser Trp
1 5 10 15

Glu Leu Leu Ser Leu Asn Trp Arg Trp Tyr Phe Arg Phe Tyr Glu Met 
20 25 30 

Leu Trp Val Phe Asn Lys Ser Pro Gln Ile Ser His Cys Met Ala Leu
35 40 45 

Arg Leu Tyr Phe Pro Tyr Ser Leu Trp Gly Arg Arg Tyr 
50 55 60 

(2) INFORMATION FOR SEQ ID NO:299: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299: 

Lys Ser Ala Gly Ile Arg Phe Arg Ser Leu Ala Leu Leu Ser Gly Arg
1 5 10 15

Leu Ser Gly Thr Val Lys His 
20 

(2) INFORMATION FOR SEQ ID NO: 300: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300: 

Arg Leu Ile Asp Ser Phe Cys Lys Lys Thr Leu Lys Arg Arg Lys Pro
1 5 10 15

Ile Ile Phe Gly Ile
20 

(2) INFORMATION FOR SEQ ID NO: 301: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:



Asn Glu Pro Gly Leu Lys Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 302:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:302: 

Asn Leu Ile Leu Cys Ser Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO:303: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303: 

Phe Arg His Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 304:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304: 

Arg Arg His Phe Gly Leu Asp Tyr lieu Phe Ile Phe Pro Phe Trp Leu
1 5 10 15

Leu Thr Cys Leu Phe Gln Ile Tyr Cys Trp Leu Trp Gly
20 25 

(2) INFORMATION FOR SEQ ID NO:305: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305: 
Trp Cys Arg Arg 



(2) INFORMATION FOR SEQ ID NO: 306:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:306: 

Pro Phe His Tyr Arg Leu Ser Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:

Tyr Phe Tyr Trp Gln Ala Val Gly Ile
1 5 

(2) INFORMATION FOR SEQ ID NO:308: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:

His Trp Arg Asn Trp Tyr Arg Ala Phe His Glu 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 309: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309: 

Ile Asp Ala Ile Cys Asn Ala Thr Phe Met Asp Arg Pro Phe Tyr Val
1 5 10 15

Tyr Ala Gly Ser Val Gly Gly Ile Gly Ser Trp Cys His Arg Lys Pro
20 25 30 

Cys Ser Gly Leu Asp Ser Asn Thr Gly Pro Asn Ala Thr Val His Asp 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 310: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310: 

Ile Ile Gly Asn Cys Asn Asn Leu Asn Gly Gln Leu Pro Met Ala
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 311:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311: 

Arg Tyr Pro Val Glu Leu Tyr Pro Ala Asp Asn Val Thr Asn Trp Arg 
1 5 10 15

Ala Trp Leu Asn Gly Thr Thr Gly Lys 
20 25 

(2) INFORMATION FOR SEQ ID NO: 312: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:312: 

Val Ala Tyr Cys Ile Gly Cys Gly Phe Tyr Ser Thr Ile Glu Pro Phe
1 5 10 15

Phe Ile Thr Ser Leu Ile Lys Lys Trp Gln Phe Arg Gly Arg Thr Phe
20 25 30 



Thr 

(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313: 

Trp Arg Ala Tyr Val Thr Tyr Leu Ser Asp Ile Thr Asn His Leu Pro
1 5 10 15

Ala Glu Asp Tyr Asp Ala Tyr Trp 
20 

(2) INFORMATION FOR SEQ ID NO: 314:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314: 

Arg Leu Gln Leu Val Arg Val Ser His Trp Arg Gly Asp Tyr Trp Phe
1 5 10 15

Phe Asn Trp Val Leu Cys Gly Gly Ser Leu Leu Gly Arg 
20 25 

(2) INFORMATION FOR SEQ ID NO: 315:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315: 

Tyr Gly Gly Val Ser Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 316:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316: 

Tyr Phe Thr Trp Arg Asp Asn Gly Tyr Asp Ile Gln Phe Tyr Asn Arg
1 5 10 15

Ser 

(2) INFORMATION FOR SEQ ID NO: 317:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317: 

Asn Leu Thr Phe Trp Leu Ala Phe Gln Pro Val Leu Val Cys Tyr Phe
1 5 10 15

Leu Tyr Lys Arg Arg His Gly Val Tyr Ile Lys His Ser Val
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 318:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:318: 

Val Ile Ser Ile Phe Thr Thr Arg Ala Tyr Phe Ile Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 319: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:319:

Pro Ala Ile Phe Lys Ile Tyr Pro Gly Arg Val Glu Asn Ala Leu Ser
1 5 10 15




Ile Met Tyr Gln Leu Leu Ser Ser Cys His Asn Met Tyr Gly Ile Ser
20 25 30 



Arg Ser Gly Phe Arg Ser Phe Lys Ser Val Gly Thr Thr Ile Glu Cys
35 40 45 

Val Phe Leu Leu Asn Ala Ala Gln Lys Tyr Ile Gly Ser Thr Asp Xaa
50 55 60 

Leu Ile Ser Phe Pro Tyr Ala Leu His His Tyr Leu Val Glu Ser Asp
65 70 75 80 

Lys Phe Tyr Ile Tyr Leu Lys Asp Trp Phe Pro Ser Val
85 90 

(2) INFORMATION FOR SEQ ID NO: 320: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:320: 

Ala Arg Lys Gln Asn Ser Leu Gln Lys Arg Asn Tyr Val Met Ala Val
1 5 10 15

Arg Lys Gly Arg Leu Ser Lys Val Leu Lys 
20 25 

(2) INFORMATION FOR SEQ ID NO: 321: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321: 

His His Tyr Phe Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO:322: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:322:

Leu Arg Phe Ile Cys Ile Phe Ile Ser Leu Leu Lys Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 323: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323: 

Leu Ser His Tyr Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO:324: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:324: 

Ile Asn His Phe Leu Met His
1 5 

(2) INFORMATION FOR SEQ ID NO: 325: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325: 

Leu Leu His Cys Cys Phe Trp Ala Leu Gly 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 326: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:326:

Leu Leu Leu Trp Val Ala Cys Phe Phe Arg Trp Gly Trp Leu Leu Pro
1 5 10 15



Ala Arg Pro Leu Val Leu Lys Ala Ser Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 327: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327: 

Val Ile Leu Ser Arg Tyr Ser Leu Tyr Ile Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 328: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328: 

Asn Tyr Val Asn Pro Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO:329: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:

Lys Leu Ser Cys Tyr Leu Leu Ser Leu Pro Phe Ser Phe Ile Ile Met
1 5 10 15

Pro Val Leu Phe Gly Arg Tyr Arg Thr Val Gly 
20 25 

(2) INFORMATION FOR SEQ ID NO: 330:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:

Pro Val Ala Cys Leu Trp Phe Leu Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 331: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:331: 
Asn Gly Tyr Gly 



(2) INFORMATION FOR SEQ ID NO: 332: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332: 

Trp Phe Phe Ile Ser Ser Leu Ala Tyr Trp Thr Ile Leu Phe Asn Ile
1 5 10 15

Ile Arg Leu Glu Lys Leu Ser Lys Asn Glu
20 25 

(2) INFORMATION FOR SEQ ID NO: 333: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:333: 

Arg Lys Thr Gly Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 334: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334: 

Arg Ser Gly Gly Arg Pro Ser Asn Glu Asp Ala Ala Ser Glu Met Gln
1 5 10 15

Ser Glu Ile Gln Ser Gly Ser Leu Ala Gln Ser Val Lys Gln Ser Val
20 25 30 

Ala Val Val Arg Asn Pro Thr His Ile Ala Val Cys Leu Gly Tyr His
35 40 45 

Pro Thr Asp Met Pro Ile Pro Arg Val Leu Glu Lys Gly Ser Asp Ala
50 55 60 

Gln Ala Asn Tyr Ile Val Asn Ile Ala Glu Arg Asn Cys Ile Pro Val
65 70 75 80 

Val Glu Asn Val Glu Leu Ala Arg Ser Leu Phe Phe Glu Val Glu Arg 
85 90 95 

Gly Asp Lys Ile Pro Glu Thr Leu Phe Glu Pro Val Ala Ala Leu Leu
100 105 110 

Arg Met Val Met Lys Ile Asp Tyr Ala His Ser Thr Glu Thr Pro
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 335: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:335:

Met Leu Leu Val Cys Phe Phe Arg Pro Leu Arg Arg Leu Arg Gly 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO:336: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:336: 



Arg Ile Glu Gln Cys Leu Thr Ile Lys Val Arg Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 337: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337: 

Ser Leu Leu Ala Trp His Lys His Gln Ile Ala Tyr Tyr Lys Ile Lys
1 5 10 15

Gln Asp Asn Gly Leu Val Arg Leu Asn Gly Leu Glu Pro Leu Asp Pro
20 25 30 

His His Val Lys Val Val Leu 
35 

(2) INFORMATION FOR SEQ ID NO: 338: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 338: 

Pro Thr Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 339: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:339: 

Thr Ala Thr Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 340: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340: 

Val Thr Thr Gly Thr Asn Ile Ser Val Thr Thr Ala Met Arg Gln Glu
1 5 10 15

Gly Asn Arg Asn Phe Leu Pro Glu Ile Thr
20 25 

(2) INFORMATION FOR SEQ ID NO: 341: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341: 

Leu Arg Trp Lys Tyr Ala Thr Cys Arg Glu Asn Ser Arg His Ala Thr 
1 5 10 15

Ala Ile Val Val Leu Ser Glu Arg Ala Ala Lys
20 25 

(2) INFORMATION FOR SEQ ID NO: 342: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 342: 

Trp Arg Thr Ala Asp Val Val Asp Ser Ala Ser Val Ala Ser Leu Thr 
1 5 10 15

Pro Pro Pro Arg Ser Gly Arg 
20 

(2) INFORMATION FOR SEQ ID NO:343: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 343: 



Thr Pro Ser Arg Ser Leu Pro Val Pro Tyr Asp Pro Pro Pro Asn Pro 
1 5 10 15

Leu Thr Pro Gly Tyr Asn Arg Trp Val Asn Leu Thr Pro Ser Arg Arg
20 25 30 



(2) INFORMATION FOR SEQ ID NO:344:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 344: 

Lys Arg Trp Asn Ala Tyr Leu Tyr Asn Arg Ala Glu Tyr Arg Cys Arg 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 345:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 345:

Ser Arg Lys Ser Gly Lys Pro Gln Arg Ala Ala Leu Ile Ala Ala Ser
1 5 10 15

Ala Thr Thr Ser Gly Leu Ser Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 346:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 346: 

Ser Lys Ala Ile Cys Leu Arg Arg Val Thr Val Lys Ile Ala Val Thr
1 5 10 15

Thr Ala Ile Gln Met Pro Thr Pro Lys Pro Val Arg Ala Ala Phe Ala
20 25 30 

His Pro Ala Leu Ser Pro Gly Pro Asp Arg 
35 40 

(2) INFORMATION FOR SEQ ID NO: 347: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347: 

Pro Thr Arg Ile Val Thr Ala Ala Ala Ser Asp Ile Gly Ser Thr Asn
1 5 10 15

Ile Ser Glu Leu Lys Leu Ser Ala Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 348: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 348:

Pro Ala Thr Ser Thr Ile Pro Asn Gly Glu Thr Ser Ser Ala Thr Thr
1 5 10 15

Ala Asn Asn Val Thr Ser Lys Asn Ser Gln Arg Asn Arg Gln Pro Gln
20 25 30 

Leu Asn Gln Ala Leu His Asp Asp Ala Ile Gly Phe Ala Lys Ala Phe
35 40 45 

Phe Ile Thr Asn Ile Thr His
50 55 

(2) INFORMATION FOR SEQ ID NO: 349:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 151 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349: 

Thr Arg Val Phe Asn Val Arg Lys His Gly Asp Lys His His Pro Ile
1 5 10 15

Asp Arg Arg Ser Arg Asn Ala Ala Ala Asp Thr Ala Glu Phe Arg His 
20 25 30 



Thr Lys Met Ala Ile Asp Lys Asn Ile Val His Arg Asn Ile His Gln
35 40 45

Gln Ala Gln Lys Ser His His His Thr Arg Phe Gly Phe Gly Gln Thr
50 55 60 



Phe Ala Leu Val Ser Arg Tyr Leu Lys Glu Lys Val Ser Cys Ala Pro 
65 70 75 80 

Gln Gln Arg Ala Lys Ile Thr His Gly Phe Ile Gly Gln Arg Arg Ile
85 90 95 

Asn Ile Met His Arg Ala Asp Asn Val Ser Gly Ile Pro Gln Asp Asp
100 105 110 

His His Gln His Gly Asp Lys Ala Arg Gln Pro Glu Pro Leu Ser Asn
115 120 125 

Leu Met Arg Asp Thr Leu Thr Thr Ala Gly Ala Ile Glu Leu Arg Asn
130 135 140 

His Arg Arg Gln Gly Gln Gln
145 150 

(2) INFORMATION FOR SEQ ID NO: 350: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350: 

Ala Val Thr Lys Gln Asn Gly Gly Lys Gln Ile Glu Val Pro Ile Ala
1 5 10 15

Thr Ala Ala Met Ser Val Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 351: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 351: 

Pro Pro Ala Met Thr Val Ser Thr Asn Pro Leu Arg Ser Ile Pro Leu
1 5 10 15

Ala Gln Gly Ser Pro Val Ser Glu Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 352:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 352: 

Leu Thr Arg Phe Thr Gly Ile Leu Leu His Val Phe Thr Phe Tyr Phe
1 5 10 15

Val Val Ile

(2) INFORMATION FOR SEQ ID NO: 353: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 353: 

Lys Thr Lys Lys Pro Pro Lys Trp Gln Pro Lys Glu Ile Ala Gly Glu
1 5 10 15

Ile Ser Val Tyr Cys Ser Gly Val Leu Leu Phe Leu Gln
20 25 

(2) INFORMATION FOR SEQ ID NO:354: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 354: 

Lys Asn Ser Cys 
1 

(2) INFORMATION FOR SEQ ID NO: 355:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 355:

Arg Arg Ile Ala Gly Lys Leu Phe Phe His Leu Leu Leu Cys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 356: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 356: 

Thr Val Leu Leu Leu Phe Ile Ser Gly Val Glu Asp Met Phe Thr Gly
1 5 10 15

Ile Val Gln Gly Thr Ala Lys Leu Val Ser Ile
20 25 



(2) INFORMATION FOR SEQ ID NO: 357: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 357: 

Ala Glu Pro Ser Gln Glu Gln Ile Asn Phe Phe Glu Gln Leu Leu Lys
1 5 10 15

Asp Glu Ala Ser Thr Ser Asn Ala Ser Ala Leu Leu Pro Gln Val Met
20 25 30

Leu Thr Arg Gln Met Asp Tyr Met Gln Leu Thr Val Gly Val Asp Tyr
35 40 45

Leu Ala Arg Ile Ser Arg Arg Ser Met Pro Ser Ala
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 358:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 358: 

His Gly Met Lys Val His Arg Ile Val Phe Leu Thr Val Leu Thr Phe
1 5 10 15

Phe Leu Thr Ala Cys Asp Val Asp Leu Tyr Arg Ser Leu Pro Glu Asp
20 25 30

Glu Ala Asn Gln Met Leu Ala Leu Leu Met Gln His His Ile Asp Ala
35 40 45

Lys Lys Asn Arg Lys Arg Met Val
50 55

(2) INFORMATION FOR SEQ ID NO: 359: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 359: 

Pro Tyr Val Ser Ser Ser Arg Gln Phe Ile Asn Ala Val Glu Ala Thr
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 360: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 360:

Arg Leu Ser Ala 
1 

(2) INFORMATION FOR SEQ ID NO: 361: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 361: 

Gly Ser Leu Gln Arg Arg Ile Arg Cys Phe Arg Leu Ile Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO:362: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:362: 

Trp Tyr His Pro Arg Lys Asn Arg Gln Lys Ile Asn Phe Leu Lys Glu
1 5 10 15

Gln Arg Ile Glu Gly Met Leu Ser Gln Met Glu Gly Arg Asp
20 25 30 

(2) INFORMATION FOR SEQ ID NO:363: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:363: 

Pro Leu Arg Tyr Arg Leu Met Met Arg Glu Val Thr Leu Leu Arg Ala 
1 5 10 15

Gln Leu Pro Tyr Leu
20 

(2) INFORMATION FOR SEQ ID NO: 364: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 364: 

Asn Ile His Leu Arg Ser Ile Trp Arg Pro Phe Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 365:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 365: 

Lys Leu Lys Ile
1

(2) INFORMATION FOR SEQ ID NO: 366:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 366: 

Arg Cys Gln Ser Leu Gly Cys Asn Thr Val Arg Leu Val Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 367: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 367: 

Cys Ser Leu Leu Asn Ser Glu Trp 
1 5 

(2) INFORMATION FOR SEQ ID NO:368: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 368: 

Leu Thr Tyr Pro Arg Asp Lys His Ser Gly Leu Trp Thr Leu Ser Thr 
1 5 10 15

Pro Ile Lys Gly Arg Trp
20 

(2) INFORMATION FOR SEQ ID NO: 369: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 369: 



Asn Thr Leu Ile Arg

(2) INFORMATION FOR SEQ ID NO: 370:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 370:

Gln Asp Cys Tyr



(2) INFORMATION FOR SEQ ID NO: 371: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 371: 

Glu Trp Ala Ser 
1 

(2) INFORMATION FOR SEQ ID NO:372: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:372: 

Ser Ala Ile Phe Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 373: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 373: 



Asp Ala Val Phe Glu Pro Thr
1 5

(2) INFORMATION FOR SEQ ID NO: 374:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:374: 

Ser Arg Gly Val Ala Thr Leu Ser Leu Phe Leu Ala Thr Cys Ser Leu 
1 5 10 15



Arg Cys Thr Gly Met Ala Gly 
20 



(2) INFORMATION FOR SEQ ID NO: 375: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 384 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 375: 

Ala Gly Leu Ser Ser Ser Asn Cys Trp Arg Tyr Gly Asp Arg Pro Glu
1 5 10 15

Leu Asp Arg Leu Leu Asp Arg Ala Leu Asn Arg Leu Arg Gly Ser Ser
20 25 30

Val Ile Pro Ala Cys Leu Asn Asp Arg Gln Lys Arg Gln Val Arg Leu
35 40 45

Ala Pro Arg Ile Ser Ala Phe Ala Phe Gly Leu Gly Leu Phe Lys Leu
50 55 60

Arg Cys Ser Asp Tyr Phe Met Leu Pro Glu Tyr Arg Gln Leu Leu Leu
65 70 75 80

Gln Trp Phe Ser Glu Asp Glu Ile Trp Gln Leu Tyr Gly Trp Leu Gly
85 90 95

Gln Arg Asp Gly Lys Leu Leu Pro Pro Gln Val Met Gln Gln Thr Ala
100 105 110

Leu Gln Ile Gly Thr Ala Ile Leu Asn Arg Glu Ala His Asp Asp Ala
115 120 125

Gly Phe Thr Cys Ala Ile Ser Ile Ile Thr Pro Ser Ala Ala Tyr Thr
130 135 140

Leu Ala Glu Asp Phe Ser Tyr Arg Asp Tyr Leu His Gly Ala Phe Ala
145 150 155 160

Met Ser Phe Thr Ser Leu Pro Leu Thr Glu Ile Asn His Lys Leu Pro
165 170 175

Ala Arg Asn Ile Ile Glu Ser Gln Trp Ile Thr Leu Gln Leu Thr Leu
180 185 190

Phe Ala Gln Glu Gln Gln Ala Lys Arg Val Ser His Ala Ile Val Ser
195 200 205

Ser Ala Tyr Arg Lys Ala Glu Lys Ile Ile Arg Asp Ala Tyr Arg Tyr
210 215 220

Gln Arg Glu Gln Lys Val Glu Gln Gln Gln Glu Leu Ala Cys Leu Arg
225 230 235 240

Lys Asn Thr Leu Glu Lys Met Glu Val Glu Trp Leu Glu Gln His Val
245 250 255

Lys His Leu Gln Asp Asp Glu Asn Gln Phe Arg Ser Leu Val Asp His
260 265 270

Ala Ala His His Ile Lys Asn Ser Ile Glu Gln Val Leu Leu Ala Trp
275 280 285

Phe Asp Gln Gln Ser Val Asp Ser Val Met Cys His Arg Leu Ala Arg
290 295 300

Gln Ala Thr Ala Met Ala Glu Glu Gly Ala Leu Tyr Leu Arg Ile His
305 310 315 320

Pro Glu Lys Glu Ala Leu Met Arg Glu Thr Phe Gly Lys Arg Phe Thr
325 330 335

Leu Ile Ile Glu Pro Gly Phe Ser Pro Asp Gln Ala Glu Leu Ser Ser
340 345 350

Thr Arg Tyr Ala Val Glu Phe Ser Leu Ser Arg His Phe Asn Ala Leu
355 360 365

Leu Lys Trp Leu Arg Asn Gly Glu Asp Lys Arg Gly Ser Asp Glu Tyr
370 375 380



(2) INFORMATION FOR SEQ ID NO: 376: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 376:

Asp Lys Asn Asp Ala Pro Tyr Ser Ile Tyr Pro Trp Pro Gly Tyr Arg
1 5 10 15

Gly Thr Arg Gly Tyr Phe Ala Phe Asn Val Ser Ser Pro Gly Val Thr
20 25 30

Gly Asn Asp Gly Gly Ser Ala Leu
35 40

(2) INFORMATION FOR SEQ ID NO: 377: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 377:

Asp Asp Gly Arg Asn Arg Asn Gly Ala Glu Trp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 378: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:378: 

Thr Ala Arg Lys Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 379: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:379: 

Glu Thr Gly Ala Gln Thr Ala Gly Phe Ala Ala Phe Asp Lys Thr Asn
1 5 10 15

Thr Gly Gly 

(2) INFORMATION FOR SEQ ID NO: 380: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 380:

Trp Gly Asn Val Ala Ser Ala Tyr Arg Arg Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 381: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 381: 

Phe Thr Glu Cys Val Ser Asn Tyr Arg Ser Cys Asn Gly Ala Tyr Cys 
1 5 10 15

Arg Arg Val Val Lys Lys Glu Lys Thr Arg Phe Ala Ile Ala Thr Gly
20 25 30 

Tyr Val Thr Ala Glu Glu Gly Trp Glu Leu Ala Val Phe Ser Leu Leu 
35 40 45 

Glu Leu Gly Glu Val Asp Thr Val Arg Cys Pro Leu 
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 382: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 382: 

Ser Val Leu Cys Asn Arg Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 383: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 383: 

Thr Thr Met Lys Cys Pro Tyr Arg Ser Gly Ser Asp Ala Trp Gln Thr
1 5 10 15

Gly Arg Ile Ala Val Asn Gly Ser Val Phe Cys
20 25 

(2) INFORMATION FOR SEQ ID NO: 384: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 384:

Pro Leu Asn Leu Ala Tyr Ala Ser Asn Pro Arg Ser Lys Val Val Trp
1 5 10 15

Pro Gln His

(2) INFORMATION FOR SEQ ID NO: 385: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 385: 

Tyr Val Cys Val Val Cys Cys Tyr Ser Leu Ala Leu Lys Lys Ser Ala 
1 5 10 15

Ser Val Arg Ser Gly Phe Ala Ser Cys Arg Leu Ile His Tyr Cys Arg
20 25 30

Tyr Tyr Ser Ile Leu Phe Val Ser Ala Gly Phe Ser Val Ile Gly Cys
35 40 45

Leu Ile Asp Leu Pro Leu
50 

(2) INFORMATION FOR SEQ ID NO: 386: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 386: 



Phe Leu His Arg Arg Cys Ser Ile Gly Tyr Ser Asn Asn Leu Met Arg
1 5 10 15

Ser Leu Cys



(2) INFORMATION FOR SEQ ID NO: 387:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 387: 

Tyr Pro Ile Thr Val Leu Thr Thr Lys Ile Asn Val Asn Lys Phe Ser
1 5 10 15

Lys Arg Phe Val Lys 
20 

(2) INFORMATION FOR SEQ ID NO: 388: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:388: 

Ile Arg Phe Tyr Ser Asp Thr Trp Leu Ser Ile Phe Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 389 : 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE : amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 389: 

Ile Gly Phe Leu Ala His His Glu Ala Ser Gly Trp Ile Gly Ile Ser
1 5 10 15

Leu Leu Asn Val Ile Phe Ser Phe Leu Phe Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 390:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 390:

Leu Asn Gly Leu



(2) INFORMATION FOR SEQ ID NO: 391:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 391: 

Gln Pro Ile Cys Ser Gly Gly Lys Asp Asn Met Lys Leu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 392:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 392: 

Arg Ser Tyr Ser Leu Met Ser Asp Thr Gln Ala Asn Leu Leu Arg Arg
1 5 10 15

Arg Thr Ala Phe 
20 

(2) INFORMATION FOR SEQ ID NO: 393:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 393: 

Leu Glu Thr Arg Ser Val Pro Gly Tyr Ser Ser Thr Ile Ile Phe Val
1 5 10 15

Ala Arg Trp Ala Cys Asn Glu Leu Phe Ser Thr Ser Phe Gln Leu Arg
20 25 30

Arg Ala Leu Val Thr Ile Thr Ser Ser Thr Asn Lys Ile Xaa Trp Ser
35 40 45

Arg Asn Ala Phe Met Val Arg
50 55

(2) INFORMATION FOR SEQ ID NO: 394:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 394: 

Gly Ser Gln Gly Ala Thr Val Ala Gln Cys Met Arg Gly Ser Ala Gly
1 5 10 15

Tyr Gly Ser Gly Asp Gly Val Ile Asn Arg Tyr Cys Asp Asp Ala Val
20 25 30 

Thr Leu Ala Asp Leu Asp Gly 
35 

(2) INFORMATION FOR SEQ ID NO: 395: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 395: 

Tyr Pro Asp Tyr Tyr Gln Pro Tyr Val Phe Ser Asp Pro Ala Leu Asn
1 5 10 15

Cys Tyr Leu Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 396: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:396: 

Pro Ser Arg Phe Ile Gly Ile Ser Val Phe Ile Thr Tyr Tyr Tyr Ile
1 5 10 15

Ile Ser Phe Val Thr His Asn Gln His Ile Thr Ala Gly Thr Val Thr
20 25 30

Thr

(2) INFORMATION FOR SEQ ID NO: 397:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:397: 

Tyr Cys Gly Cys Phe Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO:398: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:398: 

Val Cys Arg Arg Arg Lys Ser His Arg Trp Val Gly Arg Ile Tyr His
1 5 10 15

His Tyr Tyr Arg Ala Ile Tyr Cys His Tyr Lys Arg Tyr Arg Glu Gly
20 25 30 

Gly Gly Ser 
35 

(2) INFORMATION FOR SEQ ID NO: 399:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 399: 

Arg Thr Phe Leu Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 400:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 400: 

Trp Asp Ala Arg Gln Thr Asn Glu Tyr Arg Trp Arg Phe Ala Cys Arg
1 5 10 15

Ser Tyr Arg Cys Arg Pro Cys Pro Tyr Ile Lys Thr Ala Cys Pro Ala
20 25 30

Gly Lys Pro Leu Ser Arg Cys Asp Gly Arg Cys Asp Glu Ile Cys
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 401: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 401: 

Arg Arg Tyr Asp Cys Arg Tyr Tyr Cys Cys Ser Gly Glu His Tyr Arg 
1 5 10 15

Arg Tyr His Tyr Arg Tyr Arg Thr Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 402: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:402: 
Tyr Val Asp Glu 



(2) INFORMATION FOR SEQ ID NO: 403: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:403:

Gly Cys Ser His Leu
1 5 



(2) INFORMATION FOR SEQ ID NO: 404:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:404: 

Arg Thr Val Asn Arg Arg Trp Phe Met Trp Ala Asn Ser Ile Ala Ala
1 5 10 15

Asp Phe Pro 

(2) INFORMATION FOR SEQ ID NO: 405:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 405: 

Arg Gly Asn Tyr Cys His Pro Cys Pro Gly 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 406: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:406: 

Glu Thr Pro Glu Pro Gly Asp Arg Val Glu Phe Ser Asn Cys Gln Thr
1 5 10 15

Thr Ser Val Ala His Ile Asn Arg Cys Gly Phe Asn Ala Pro Arg Phe
20 25 30

Asn Ser Trp Leu Ser Phe Tyr His Ser Arg Phe Leu Phe Ser Val Val
35 40 45

Ser Ile Ala Asn Tyr Pro His Ser Pro Gln Lys Val Cys Gly Phe Arg
50 55 60

Lys Trp Arg Arg Ser Thr Gly Lys Arg

(2) INFORMATION FOR SEQ ID NO: 407:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 407: 

Tyr Gly Ser Arg Arg Met Ser Ser Asn Leu Thr Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO:408: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 408: 

Pro Asp Val Thr Phe Cys Arg Pro Asp Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 409:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 409: 

Arg His Glu Met Val Phe Ile
1 5 

(2) INFORMATION FOR SEQ ID NO: 410:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:410: 



Gly Tyr Arg Arg Pro Ser Pro 
1 5 



(2) INFORMATION FOR SEQ ID NO: 411: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:411: 

Thr His Arg Lys Ile Asp Gly Thr Ala Ile Ser Gly Thr Arg Ile
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 412:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:412: 

Phe Ile Tyr Ser Arg Ser Gly Gly Leu Phe Ile Asp Arg Arg Gly Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO:413: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 413: 

Gln Pro Asp Val Thr Glu Arg Asp Gly Ala Asp Leu Leu Ala Tyr Lys
1 5 10 15

Arg His Gly Pro 
20 

(2) INFORMATION FOR SEQ ID NO: 414:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 414:

Gly Ala Arg Phe Trp Thr Gly Arg Phe Arg Gly Gln Pro Thr Tyr Leu
1 5 10 15

Cys Leu Ile Lys Met Cys Pro Ala Ser Ala Tyr Gly Arg Val Tyr Trp
20 25 30 

Cys Ser Gly Asn Ala Leu Ser Asn Glu Cys Asp Gly Lys Lys Leu Leu 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 415:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 415:

Ala Gly Glu Arg Ala Ser Ala Pro Val Thr His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 416:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 416: 

Asn Phe Ala Thr Ala Cys Ile Arg Ala Gly Phe Tyr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 417: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 417: 

Arg Phe Thr Ser Tyr Phe Arg His Leu Asn 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 418: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:418: 

Leu Gly Ala Thr 
1 

(2) INFORMATION FOR SEQ ID NO:419:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 419: 

Lys Arg Cys Pro Asp Val Asp Arg Ile Cys Pro Tyr Arg Ala Ser Ser
1 5 10 15

Ser Tyr Ser Ala Ser Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 420: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:420: 

Ser Gly Arg Lys Thr Ala Ala Asp Phe Ala Asp Arg Arg Arg Tyr 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 421: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 421: 

Lys Pro Arg Ala 
1 

(2) INFORMATION FOR SEQ ID NO:422: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 422: 

Ile His Ser Pro Asp Gly Asn Gly Asp Leu Tyr Cys Ala Val Val Ser
1 5 10 15

Ser 

(2) INFORMATION FOR SEQ ID NO:423: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:423: 

Asp Ala Asp Pro Ala Thr Tyr Arg Ala Gly Ala Glu Ala Val Ser Gln
1 5 10 15

Ile Ile His Cys His Phe Cys Arg His Pro Thr Phe Leu Ala Lys Asn
20 25 30

Tyr Arg Ser His Leu Val Arg Arg Thr Asp Phe Val Met Ala Gly Ile
35 40 45 

Arg Arg Gly Glu Pro Tyr Thr Ser Gly Arg Lys Tyr 
50 55 60 

(2) INFORMATION FOR SEQ ID NO:424: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:424: 

Arg Arg Gly Val Gly Gly Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 425: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 425:

Asn Ile Arg Pro Pro Met Val Ile Val Asp Gly Ala Glu Phe Arg Met
1 5 10 15

Ser Ala Gln Arg Cys
20 

(2) INFORMATION FOR SEQ ID NO: 426: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:426: 

Met Arg Gly Cys Leu Gly Tyr Leu Trp Ala Ser Cys Ala Val 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 427: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 427:

Ser Leu Glu Lys Asn Leu Leu Lys Ser Trp Gly Leu Met Ala Ala Lys
1 5 10 15

Leu Cys Tyr Leu Leu Leu Arg Val Gin Ser Gly Phe Thr Ala Gly Ser 
20 25 30 

Lys 

(2) INFORMATION FOR SEQ ID NO:428: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:428: 

Ala Thr Pro Ser Gly Ser Arg Gly Arg Ser Val Ile Arg Ala Ser Tyr
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 429:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:429: 

Trp Leu Trp Ser Ser Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 430: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 430: 

Trp Pro Arg Thr Ala Arg Arg Leu Leu Glu Arg Leu 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 431:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:431: 

Cys Asn Ala Ser Ser Arg Asn Gly Ser Thr Ala Tyr His Ser Thr Ile
1 5 10 15

Asn Asp Gly Asp Ser Arg Tyr 
20 

(2) INFORMATION FOR SEQ ID NO: 432:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:432: 

Arg Cys Asp Leu Trp Arg Arg Ala Thr Ser Gly Tyr Phe Phe Cys Ser 
1 5 10 15

Trp Arg Gly Glu Lys His Ala Ser Gly Asp Ala Val
20 25 



(2) INFORMATION FOR SEQ ID NO:433: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:433: 

Cys Ala Arg Arg Arg Gln Gln Cys Ser Gly Val Asn Trp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 434: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 434: 

Thr Trp Thr Arg Ser Pro Arg Ile His Arg Phe Tyr Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 435: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 435: 

Arg Asp Pro Lys Thr Leu Cys His Cys Cys Arg Asn Leu 
1 5 10

(2) INFORMATION FOR SEQ ID NO:436:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:436:

Gln Thr Arg Leu Arg Ala Arg Glu Gly Ala Val Cys Gly His His Asp
1 5 10 15

Ser Arg Ile Phe Ser Arg
20 

(2) INFORMATION FOR SEQ ID NO: 437: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 437: 

Trp Lys Ala Ser Arg Leu Ala Cys Arg Leu Thr Asp Ala Leu Cys Gln
1 5 10 15

Gly Arg Thr Glu Ile Ala Leu Ala Pro Glu Arg Pro Arg Phe Leu Glu
20 25 30

Asn Ile Ala Arg Arg Ile
35 

(2) INFORMATION FOR SEQ ID NO: 438: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:438: 

Cys Ile Ala Thr Thr Phe Arg Thr Tyr Gly Asn Gly Arg Lys Arg Gln
1 5 10 15

Tyr Tyr Arg Ile Leu Tyr Gly Thr Gly Gly Arg Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 439: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:439: 

Ser Arg Trp Arg Met Lys Ser Val His Cys Leu Met Asp Ile Leu Tyr
1 5 10 15

Tyr Pro Asp Gly Leu Gln Arg Gly Gly Ile Ile Leu Pro Leu Thr Cys
20 25 30

Trp Gln Arg Ser Ala Ala Phe Phe Gln Ser Leu Pro Ala Met Ser Ile
35 40 45 

Val Asn Trp Arg Arg Tyr Cys Asp Gly Ala Trp Arg Phe Thr Arg Arg 
50 55 60 

Leu Asn Cys 
65 

(2) INFORMATION FOR SEQ ID NO: 440: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:440:

Tyr Ala Leu Gly Asn Thr Ser Glu Glu Leu Ile Gln Ile Leu Thr Lys
1 5 10 15

Pro Leu Ile Pro Ile Arg Ile Phe Ala His Phe Cys Asp Lys Val Arg
20 25 30 

Met Lys Tyr Ala Asp Pro Ser Tyr Leu 
35 40 

(2) INFORMATION FOR SEQ ID NO: 441: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 441: 

Lys Asn Tyr Thr Lys Tyr Ser Pro Ser Asp His Gly Asn Phe Ala Gly 
1 5 10 15

Asp Asn Arg Ala Ala Glu Lys Gln Leu Arg Gly Lys Leu Thr Val Leu
20 25 30

Asp Gln Gln Gln Gln Ala Ile Ile Thr Glu Gln Gln Ile Cys Gln Thr
35 40 45

Arg Ala Leu Ala Val Ser Thr Arg Leu Lys Glu Leu Met Gly Trp Gln
50 55 60

Gly Thr Leu Ser Cys His Leu Leu Leu Asp Lys Lys Gln Gln Met Ala
65 70 75 80

Gly Leu Phe Thr Gln Ala Gln Ser Phe Leu Thr Gln Arg Gln Ala Val
85 90 95

Arg Glu Ser Val Ser Ala Ala Cys Leu Pro Ala Lys Arg Ile Thr Glu
100 105 110

Glu Phe 

(2) INFORMATION FOR SEQ ID NO:442:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:442: 

Cys Ala Tyr Glu Lys Glu Arg Lys Asn Tyr Tyr Gly Ile Lys Arg Cys
1 5 10 15

Val Leu Pro Lys Leu Arg Glu Val Leu Gly Cys His Ala Ser Leu Ile
20 25 30

Arg Met Ile Thr Arg Arg Arg Arg Asn Val Trp Thr Leu Asn Asn Ser
35 40 45

Cys Thr Arg His Tyr Pro Leu Val Arg Ile Ile Leu Leu Gln His
50 55 60 

(2) INFORMATION FOR SEQ ID NO:443: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:443: 

Ile Arg Thr Trp Phe Ser Arg Asn Val Ile Val Leu Val Ala Val Ile
1 5 10 15

Leu Thr Val 

(2) INFORMATION FOR SEQ ID NO:444: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:444:

Ser Val Lys Tyr Val Asn Gln Gly Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 445: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 445: 

Glu Ser Met Ser Leu Ile Met Lys Phe Thr Val Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 446: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 446: 

Ser Ser Gly Trp Ser Leu Ser Cys Cys Ile Trp Gly Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO:447: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 328 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 447: 

Phe Pro Trp Arg Tyr Ser Met Leu Arg Ile Ala Asn Glu Glu Arg Pro
1 5 10 15

Trp Val Glu Ile Leu Pro Thr Gln Gly Ala Thr Ile Gly Glu Leu Thr
20 25 30

Leu Ser Met Gln Gln Tyr Pro Val Gln Gln Gly Thr Leu Phe Thr Ile
35 40 45

Asn Tyr His Asn Glu Leu Gly Arg Val Trp Ile Ala Glu Gln Cys Trp
50 55 60

Gln Arg Trp Cys Glu Gly Leu Ile Gly Thr Ala Asn Arg Ser Ala Ile
65 70 75 80

Asp Pro Glu Leu Leu Tyr Gly Ile Ala Glu Trp Gly Leu Ala Pro Leu
85 90 95 

Leu Gln Ala Ser Asp Ala Thr Leu Cys Gln Asn Glu Pro Pro Thr Ser
100 105 110

Cys Ser Asn Leu Pro His Gln Leu Ala Leu His Ile Lys Trp Thr Val
115 120 125

Glu Glu His Glu Phe His Ser Ile Ile Phe Thr Trp Pro Thr Gly Phe
130 135 140

Leu Arg Asn Ile Val Gly Glu Leu Ser Ala Glu Arg Gln Gln Ile Tyr
145 150 155 160

Pro Ala Pro Pro Val Val Val Pro Val Tyr Ser Gly Trp Cys Gln Leu
165 170 175

Thr Leu Ile Glu Leu Glu Ser Ile Glu Ile Gly Met Gly Val Arg Ile
180 185 190

His Cys Phe Gly Asp Ile Arg Leu Gly Phe Phe Ala Ile Gln Leu Pro
195 200 205 

Gly Gly Ile Tyr Ala Arg Val Leu Leu Thr Glu Asp Asn Thr Met Lys
210 215 220

Phe Asp Glu Leu Val Gln Asp Ile Glu Thr Leu Leu Ala Ser Gly Ser
225 230 235 240

Pro Met Ser Lys Ser Asp Gly Thr Ser Ser Val Glu Leu Glu Gln Ile
245 250 255

Pro Gln Gln Val Leu Phe Glu Val Gly Arg Ala Ser Leu Glu Ile Gly
260 265 270

Gln Leu Arg Gln Leu Lys Thr Gly Asp Val Leu Pro Val Gly Gly Cys
275 280 285

Phe Ala Pro Glu Val Thr Ile Arg Val Asn Asp Arg Ile Ile Gly Gln
290 295 300

Gly Glu Leu Ile Ala Cys Gly Asn Glu Phe Met Val Arg Ile Thr Arg
305 310 315 320 



Trp Tyr Leu Cys Lys Asn Thr Ala 
325 



(2) INFORMATION FOR SEQ ID NO:448: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:448: 

Tyr Ala Asn Asn Ile Ile Ala Phe Gln Val Val Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO:449: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:449: 

Glu Ile Gln Tyr Val Phe Thr Arg Phe Ala Phe Ala Thr Asp Trp Tyr
1 5 10 15

Ile Val Ser Ala Phe Asn Thr Ala Ser His Tyr Arg His Gly Asn Phe
20 25 30 

Phe Pro 

(2) INFORMATION FOR SEQ ID NO: 450: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 450: 

Thr Gly Gly Gly Ile Phe Asp Phe Thr Lys Cys Ser Gly Tyr Ser Thr
1 5 10 15

Ser Pro Pro Lys Tyr Arg Thr Val Trp Pro Cys Ala Cys Thr Phe Leu
20 25 30

Ile His Tyr Gly Ala Asp Ala Ile Ser Cys Lys Arg Ala Leu Ala Ser
35 40 45 

Gly Ser Gly Arg Trp Arg Ser Phe Leu Asp Val 
50 55 

(2) INFORMATION FOR SEQ ID NO: 451: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:451:

Ser Ile Ser Ala Leu Ser Thr Val Phe Ala Lys Lys Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO:452: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:452: 

Arg Glu Gly Ser Gln Leu Phe Ser Glu Phe Asp Lys Thr Asn Leu Ala
1 5 10 15

(2) INFORMATION FOR SEQ ID NO:453: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 453: 

Arg His Lys Lys Lys Asp Lys Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO:454:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:454: 

Phe Phe Ala His Ile Asn Ser Gly Ile Tyr Gly Glu Ser Val Asn Ala
1 5 10 15

Gly Ile Ser Asp Trp Ile Thr Tyr Leu Ser Ser Leu Ser Gly Tyr
20 25 30 

(2) INFORMATION FOR SEQ ID NO:455:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 455: 

Pro Ala Tyr Phe Lys Tyr Thr Ala Gly Tyr Gly Asp Asp Asp Gly Val 
1 5 10 15

Ala Asp Asp His Phe Ile Thr Val
20 

(2) INFORMATION FOR SEQ ID NO: 456: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 110 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 456: 

Ala Ala Asn Ile Phe Thr Gly Arg Arg Leu Gly Ser Asp Thr Gly Ala
1 5 10 15

Ile Gly Thr Glu Leu Phe Met Asn Asp Ser Glu Leu Thr Gln Phe Val
20 25 30

Thr Gln Leu Leu Trp Ile Val Leu Phe Thr Ser Met Pro Val Val Leu
35 40 45

Val Ala Ser Val Val Gly Val Ile Val Ser Leu Val Gln Ala Leu Thr
50 55 60

Gln Ile Gln Asp Gln Thr Leu Gln Phe Met Ile Lys Leu Leu Ala Ile
65 70 75 80

Ala Ile Thr Leu Met Val Ser Tyr Pro Trp Leu Ser Gly Ile Leu Leu
85 90 95

Asn Tyr Thr Arg Gln Ile Met Leu Arg Ile Gly Glu His Gly
100 105 110 

(2) INFORMATION FOR SEQ ID NO:457:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:457: 

Met Ala Gln Gln Val Asn Glu Trp Leu Ile Ala Leu Ala Val Ala Phe
1 5 10 15

Ile Arg Pro Leu Ser Leu Ser Leu Leu Leu Pro Leu Leu Lys Ser Gly
20 25 30

Ser Leu Gly Ala Ala Leu Leu Arg Asn Gly Val Leu Met Ser Leu Thr
35 40 45

Phe Pro Ile Leu Pro Ile Ile Tyr Gln Gln Lys Ile Met Met His Ile
50 55 60

Gly Lys Asp Tyr Ser Trp Leu Gly Leu Val Thr Gly Glu Val Ile Ile
65 70 75 80

Gly Phe Ser Ile Gly Phe Cys Ala Ala Val Pro Phe Trp Ala Val Asp
85 90 95 

Met Ala Gly Phe Leu Leu Asp Thr Leu Arg Gly Ala Thr Met Gly Thr 
100 105 110 

Ile Phe Asn Ser Thr Ile Glu Ala Glu Thr Ser Leu Phe Gly Leu Leu
115 120 125

Phe Ser Gln Phe Leu Cys Val Ile Phe Phe Ile Ser Gly Gly Met Glu
130 135 140

Phe Ile Leu Asn Ile Leu Tyr Glu Ser Tyr Gln Tyr Leu Pro Pro Gly
145 150 155 160

Arg Thr Leu Leu Phe Asp Gln Gln Phe Leu Lys Tyr Ile Gln Ala Glu
165 170 175

Trp Arg Thr Leu Tyr Gln Leu Cys Ile Ser Phe Ser Leu Pro Ala Ile
180 185 190

Ile Cys Met Val Leu Ala Asp Leu Ala Leu Gly Leu Leu Asn Arg Ser
195 200 205

Ala Gln Gln Leu Asn Val Phe Phe Phe Ser Met Pro Leu Lys Ser Ile
210 215 220 

Leu Val Leu Leu Thr Xaa 
225 230 

(2) INFORMATION FOR SEQ ID NO: 458: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 458: 

Ser His Ser Leu Met Leu Phe Ile Thr Ile Trp Leu Lys Ala Ile Asn
1 5 10 15

Phe Ile Phe Ile

(2) INFORMATION FOR SEQ ID NO:459:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:459: 

Lys Thr Gly Phe His Leu Tyr Glu Arg Glu Asn Arg Thr Ala Tyr Arg 
1 5 10 15

Lys Glu Ile Thr
20 

(2) INFORMATION FOR SEQ ID NO: 460: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 460: 

Gly Arg Ala Gly Cys Gln Lys Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO:461:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 461: 

Asn Asn Ile Ile Ile Ser Ala Asp Cys Ala Leu Phe Val Phe Ser Phe
1 5 10 15

Leu Tyr 

(2) INFORMATION FOR SEQ ID NO: 462: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:462:



Lys Asp Asp Phe Asp Thr Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO:463:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:463: 

Val Asn Asn Phe His Ile Thr Ile Ser Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 464: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 464: 

Thr Ile Phe Leu Cys Ile Asn Ala Ile Glu Ser Cys Phe Asn Arg Val
1 5 10 15

Thr Asp Phe Cys Thr Ala Val Ser Gly Arg Trp Gly Asn Ser Cys Tyr 
20 25 30 

Cys Gly 

(2) INFORMATION FOR SEQ ID NO:465: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 465: 

Arg Val Ser Ser Gly Gly Gly Gly Tyr Cys Gln Gln Gly His Trp Phe
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 466: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 466: 

Lys Arg Ala Tyr Lys Ser Gly Lys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 467: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 467: 

Ala Asp Ile Leu Phe Thr
1 5 

(2) INFORMATION FOR SEQ ID NO:468: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 468: 

Arg Ser Arg Ile Met
1 5 

(2) INFORMATION FOR SEQ ID NO:469:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:469: 

Ile Gln Pro Lys Ser Tyr His Ala Ile Ser Tyr Leu Cys Leu Phe Leu
1 5 10 15

Leu Leu Leu Cys Gln Tyr Phe Ser Gly Ala Thr Val Leu Trp Val Ser
20 25 30 



Leu Trp Arg Ala Cys Gly Phe Phe Phe Asn Lys Met Val Met Gly Arg 
35 40 45

Gly Asp Gly Phe Leu Tyr Arg Arg Trp His Thr Gly Leu Phe Phe Ser
50 55 60

Ile Leu
65 

(2) INFORMATION FOR SEQ ID NO: 470: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 470: 

Lys Ser Tyr Leu Lys Met Ser Lys Asp Asp Val Lys Gln Glu His Lys
1 5 10 15

Asp Leu Glu Gly Asp Pro Gln Met Lys Thr Arg Arg Arg Lys Cys Arg
20 25 30 

Val Lys Tyr Lys Val Gly Val 
35 

(2) INFORMATION FOR SEQ ID NO: 471: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 471: 

Leu Asn Leu Leu Asn Asn Leu Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO:472: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:472: 

Cys Val Ile Gln Arg Ile Leu Arg Phe Val Leu Ala Ile Ile Pro Pro
1 5 10 15

Ile Cys Gln Tyr His Ala Ser Trp Lys Lys Ala Val Met Leu Lys Leu
20 25 30

Thr Ile Leu Leu Thr Ser Leu Asn Ala Thr Ala Ser Pro Leu Leu Lys
35 40 45

Met Leu Ser Trp Pro Ala His Tyr Phe Leu Lys Trp Asn Ala Glu Ile
50 55 60

Lys Phe Leu Lys Arg Tyr Leu Asn Pro Leu Gln Pro Cys Tyr Val Trp
65 70 75 80 

(2) INFORMATION FOR SEQ ID NO:473: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:473:

Ile Met Arg Ile Leu Pro Lys His His Lys Cys Phe Trp Tyr Ala Ser
1 5 10 15

Ser Gly His Cys Glu Gly 
20 

(2) INFORMATION FOR SEQ ID NO:474: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:474: 

Glu Gly Asn Ser Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 475: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:475:

Glu Thr Glu Asn Asn Arg Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO:476:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:476: 

Pro Gly Thr Ser Thr Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 477: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 477: 

Arg Ile Ile Lys Leu Asn Lys Ile Met Asp Trp Cys Val
1 5 10

(2) INFORMATION FOR SEQ ID NO:478: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 478: 

Met Asp Ser Asn His Ser Thr Pro Thr Met Ser Arg Trp Cys Ser Asn 
1 5 10 15

Gln Leu Ser Tyr Glu Arg Gln Arg Cys Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 479: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:479:

Gln Arg Gly Arg Ile Leu Ala Ser Gln Pro Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO:480:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:480:

Gly Lys Arg Glu Ile Ala Ile Phe Phe Leu Lys Ser Pro Asp Cys Gly
1 5 10 15

Gly Asn Met Gln His Val Glu Lys Ile Ala Ala Met Arg Arg Leu Ser
20 25 30

Ser Tyr Tyr Arg Ser Ala Leu Gln Asn Asp Gly Gly Arg Leu Thr Leu
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 481: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 481: 

Ile Ala His Pro
1 

(2) INFORMATION FOR SEQ ID NO:482: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:482: 

His Arg Arg Arg Gly Gln Ala Asp Asp Glu Pro His Pro Glu Ala Cys
1 5 10 15

Arg Ser His Thr Ile His His Gln Ile Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 483: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 483: 

Arg Gln Asp Ile Thr Ala Gly
1 5 

(2) INFORMATION FOR SEQ ID NO:484:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 484: 

His Pro Val Gly Gly Lys Gly Asp Lys Lys Asp Gly Thr Arg Ile Phe
1 5 10 15

Ile Thr Ala Gln Asn Thr Ala Ala Asp Asn Leu Tyr Arg Val Gly Asn
20 25 30

Leu Val Asn Arg Ser Glu Gln His
35 40 

(2) INFORMATION FOR SEQ ID NO: 485: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 485: 

Leu Arg Gln Ala Pro Arg Pro Gln Gly Cys His Cys Arg Ala Lys Gln
1 5 10 15

Tyr Ala Tyr Ala Glu 
20 

(2) INFORMATION FOR SEQ ID NO: 486: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:486:

Pro Gln Pro Tyr Lys Cys Arg Arg Leu Asn Arg Tyr Ala Leu Arg Leu
1 5 10 15

Arg Ile Gln Arg
20 

(2) INFORMATION FOR SEQ ID NO:487: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:487: 

Ala Leu Ala Gln Thr Asp Asn Pro Leu Glu Ser Leu Pro Pro Gln Pro
1 5 10 15

Ala Thr Ser Ala Val Arg Thr Ser Ala Ser 
20 25 

(2) INFORMATION FOR SEQ ID NO:488: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:488: 

Ala Gln Ser Asp Asp Arg Arg His Pro Gln Tyr Leu Met Ala Lys Pro
1 5 10 15

Ala Ala Gln Arg Pro Gln Ile Thr Ser Leu Gln Arg Thr Ala Ser Ala
20 25 30

Ile Gly Asn Pro Ser
35 

(2) INFORMATION FOR SEQ ID NO:489:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:489: 

Ile Arg Arg Phe Met Thr Thr Leu Ser Gly Leu Pro Lys Pro Phe Ser
1 5 10 15

Leu Arg Ile Ser Arg Ile Glu Arg Ala Cys Leu Met

(2) INFORMATION FOR SEQ ID NO:490:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 490: 

Glu Ser Met Ala Ile Asn Ile Thr Gln
1 5 

(2) INFORMATION FOR SEQ ID NO:491:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:491:

Thr Ala Ala Val Ala Thr Pro Gln Pro Ile Pro Pro Ser Ser Gly Ile
1 5 10 15

Pro Lys Trp Pro 
20 

(2) INFORMATION FOR SEQ ID NO:492:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 492: 

Phe Thr Gly Ile Phe Thr Ser Arg Pro Lys Asn Pro Ile Thr Ile Pro
1 5 10 15

Gly Leu Val Leu Ala Arg Pro Ser His Trp Phe Arg Ala Thr 
20 25 30 

(2) INFORMATION FOR SEQ ID NO:493:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 493: 

Lys Lys Arg Tyr Pro Ala Pro His Ser Ser Ala Arg Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO:494: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 494: 

Pro Thr Ala Leu Ser Ala Ser Ala Gly Ser Ile Leu Cys Ile Glu Arg
1 5 10 15

Ile Met Tyr Pro Ala Phe His Arg Thr Ile Ile Thr Ser Thr Glu Thr
20 25 30

Lys Pro Ala Ser Gln Asn Pro Cys Arg Thr
35 40 

(2) INFORMATION FOR SEQ ID NO:495:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 495: 

Cys Ala Ile Arg Ser Arg Arg Pro Glu Pro Leu Ser Cys Ala Ile Thr
1 5 10 15

Gly Val Lys Ala Ser Ser Lys Pro 
20 

(2) INFORMATION FOR SEQ ID NO: 496: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:496: 



Pro Asn Lys Met Ala Gly Ser Arg 
1 5

(2) INFORMATION FOR SEQ ID NO:497:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:497: 

Arg Arg Gln Pro Cys Pro
1 5 

(2) INFORMATION FOR SEQ ID NO:498:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 498: 

Arg Tyr Ser Leu Pro Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO:499: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 499: 

Arg Tyr Arg Arg Ile His Cys Gly Leu Tyr His Leu Arg Lys Asp His
1 5 10 15 

Arg Tyr Leu Asn Ala Asn Asn 
20 

(2) INFORMATION FOR SEQ ID NO: 500: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:500:

Arg Ala Ser Leu Val Tyr Phe Cys Thr Tyr Ser Pro Phe Ile Leu Leu
1 5 10 15

Leu Tyr Glu Arg Leu Lys Ser Arg Arg Ser Gly Ser Gln Lys Lys
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 501: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 501: 

Gln Gly Lys Phe Gln Ser Ile Val Ala Gly Tyr Tyr Tyr Phe Ser Ser
1 5 10 15

Glu Lys Thr Val Val Asn Gly Ala Leu Leu Ala Ser Cys Phe Ser Thr
20 25 30

Cys Tyr Cys Ala Glu Gln Phe Cys Phe Tyr Leu Phe Gln Glu Leu Lys
35 40 45

Ile Cys Leu Arg Gly Ser Tyr Arg Val Pro Arg Asn Trp Tyr Arg
50 55 60